Document Extraction Error

Gotcha | Posts: 44

Wednesday, August 16, 2017 at 07:38am

Hi, I have 3 forms. Each form has signature capture fields, for example SignerId1.Capture1, SignerId2.Capture2, [ExtraSignerId1.Capture1], [ExtraSignerId2.Capture2], etc. When I print the forms individually as 3 packages, each package is created and the email is triggered to complete the signing ceremony. However, when I bundle the 3 PDFs together (via a 3rd-party tool in my application), it creates 1 PDF combining all 3 forms with their respective signature capture fields, and the response returned is an error. Note that whether I create 3 packages of 1 form each or 1 package with 3 forms, the request JSON, which defines the signers, roles, etc., is exactly the same. The error I get when I combine the 3 PDFs into one package is the following:

{"messageKey":"error.validation.extractingFields","packageId":null,"technical":"Package: 75028f88-8743-4e86-8744-90579ddec7a7 Document: 0ed89a90e9b9aa48, Errors: [Could not find extract info role: ExtraSignerId4(2), Could not find extract info role: ExtraSignerId3(3), Could not find extract info role: ExtraSignerId3(2), Could not find extract info role: PSignerId1(2), Could not find extract info role: PSignerId2(2), Could not find extract info role: SignerId2(2), Could not find extract info role: ExtraSignerId1(2), Could not find extract info role: ExtraSignerId1(3), Could not find extract info role: ExtraSignerId2(2), Could not find extract info role: ExtraSignerId5(2), Could not find extract info role: ExtraSignerId2(3), Could not find extract info role: SignerId1(2), Could not find extract info role: SignerId2(3), Could not find extract info role: SignerId1(3), Could not find extract info role: ExtraSignerId6(2), Could not find extract info role: ExtraSignerId4(3), Could not find extract info role: CSignerId1(2), Could not find extract info role: CSignerId2(2)]","entity":null,"message":"Unable to extract PDF fields from this document.","code":400,"name":"Validation Error"}

Can you tell me what I am doing wrong?
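
For anyone debugging a similar response, the roles that extraction could not resolve can be pulled out of the "technical" string with a small regex. The following is a minimal sketch using only the JDK; the class and method names are hypothetical and not part of the eSignLive SDK, and it assumes the message keeps the "Could not find extract info role: Name(n)" wording shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the eSignLive SDK.
final class ExtractionErrorRoles {
    // Captures the base role name and the optional "(n)" suffix from
    // "Could not find extract info role: SignerId1(2)".
    private static final Pattern MISSING_ROLE =
            Pattern.compile("Could not find extract info role: ([^,\\]()]+)(?:\\((\\d+)\\))?");

    /** Groups the numeric suffixes found in the message by base role name. */
    static Map<String, List<Integer>> missingRoles(String technical) {
        Map<String, List<Integer>> byRole = new TreeMap<>();
        Matcher m = MISSING_ROLE.matcher(technical);
        while (m.find()) {
            List<Integer> suffixes =
                    byRole.computeIfAbsent(m.group(1).trim(), k -> new ArrayList<>());
            if (m.group(2) != null) {
                suffixes.add(Integer.parseInt(m.group(2)));
            }
        }
        return byRole;
    }

    public static void main(String[] args) {
        String technical = "Errors: [Could not find extract info role: SignerId1(2), "
                + "Could not find extract info role: SignerId2(2), "
                + "Could not find extract info role: SignerId1(3)]";
        // prints {SignerId1=[2, 3], SignerId2=[2]}
        System.out.println(missingRoles(technical));
    }
}
```

Seeing every base role repeated with suffixes (2) and (3) is a hint that the field names in the uploaded PDF are not the ones defined in the templates.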

harishaidary | Posts: 1812

Reply to: Document Extraction Error

Thursday, August 17, 2017 at 05:48am

Hi Cyril, Do you mind sharing your document if it doesn't contain any sensitive information on it?

harishaidary | Posts: 1812

Reply to: Document Extraction Error

Thursday, August 17, 2017 at 11:02am

Well, the problem I am seeing here is that when you merge your PDFs, your form field names get renamed to [ExtraSignerId1(1).Capture1], [ExtraSignerId1(2).Capture1], [ExtraSignerId1(3).Capture1], etc. Of course, these signers don't exist in your package (correct me if I'm wrong here). Normally, the application should ignore these fields, as I was able to run this sample code:

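import java.util.Date;

import com.silanis.esl.sdk.DocumentPackage;
import com.silanis.esl.sdk.EslClient;
import com.silanis.esl.sdk.PackageId;
import com.silanis.esl.sdk.builder.DocumentBuilder;
import com.silanis.esl.sdk.builder.PackageBuilder;
import com.silanis.esl.sdk.builder.SignerBuilder;

// Connect to the sandbox environment
EslClient eslClient = new EslClient(info.API_KEY_SANDBOX, "https://sandbox.esignlive.com/api");

// Only one signer is defined; its custom ID matches one role in the merged PDF
DocumentPackage pkg = PackageBuilder.newPackageNamed("doc extract " + new Date())
        .withSigner(SignerBuilder.newSignerWithEmail("[email protected]")
                .withFirstName("John")
                .withLastName("Smith")
                .withCustomId("CSignerId1"))
        // Upload the merged PDF with document extraction turned on
        .withDocument(DocumentBuilder.newDocumentWithName("sample form")
                .fromFile("C:/Users/hhaidary/Desktop/Merge_3forms.pdf")
                .enableExtraction())
        .build();

// Create the package and send it in one call
PackageId packageId = eslClient.createAndSendPackage(pkg);

System.out.println(packageId);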

Gotcha | Posts: 44

Reply to: Document Extraction Error

Friday, August 18, 2017 at 06:38am

After following up with the 3rd party whose tool we use to combine the individual PDFs into one PDF, they confirmed this is the expected behaviour:

If you are merging multiple PDFs that have form fields with the same name, then the DynamicPDF Merger product will rename the form fields while merging so that the output PDF has unique form field names. It is expected behaviour as per the PDF specification. It is suggested to have unique form field names in the PDF.

There are no options available in the DynamicPDF Merger API to preserve duplicate form field names in the merged output PDF.

So now I've hit a roadblock and need your advice on applying signature fields to the final output when multiple PDFs come into play, which will be a common occurrence. To recap the challenge I'm having now:

1. The lender generates a list of PDFs to be signed. These PDFs derive from several individual PDF templates with field names, which DynamicPDF uses via metadata to map each field name to data from our application. Field names that are not part of the metadata mapping remain open fields in the final generated PDF, which is then uploaded to ESL to create the package. The signature fields are among these non-mapped fields and carry the field names of the placeholders where signatures are required, [SignerId1.Capture1], [SignerId1.Capture2], etc., which respect the UNIQUE attribute within each PDF individually.
2. With the ESL API, we use the document extraction method. On each individual PDF (the original source templates) we added the same field names where the primary applicant is required to sign, e.g. [SignerId1.Capture1], [SignerId1.Capture2], and when we submit to ESL, the corresponding JSON payload containing those field names is passed. Sending packages of individual forms to ESL works.
3. When multiple PDFs are combined by DynamicPDF, it parses the final combined PDF and makes sure the field names are unique, appending an incremental (#) to each duplicate it finds while parsing top down. Changing the field names breaks the logic where ESL looks for the fields defined in the JSON payload, which generates the error on your end.

So on one end (ESL), to define signature fields I need a predefined placeholder with a hardcoded name, [SignerId1.Capture1]. If the same signature is required on another part of the PDF or on another PDF, it should have the same name [SignerId1.Capture1] but an incremental SigStyle#. This is respected in each PDF individually.

On the other end (DynamicPDF), when the PDFs are combined into one, my setup no longer works, as the UNIQUE attribute is not respected once combined, so DynamicPDF appends (#) to preserve uniqueness. As a result, the JSON no longer matches the signature field names on the final PDF (after it was processed with DynamicPDF).

How can I work around this problem other than by creating individual packages for each PDF? Our current workflow does not allow this without a drastic change in how PDFs are bundled and printed at once (the conventional way, to be paper signed). In the digital way, we're hoping to bundle them in one PDF and have the bundle signed by the different parties that appear on the combined PDF.
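
The mismatch described in this post can be reproduced with a toy model. The sketch below is an illustration only: it is not DynamicPDF's actual algorithm, the class and method names are hypothetical, and it simply assumes that the first occurrence of a field name is kept while each later duplicate gets "(n)" appended to the role part, which is consistent with the roles listed in the error message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: a model of the renaming, NOT DynamicPDF's real algorithm.
final class MergeRenameModel {
    /** Returns the role part of "Role.CaptureN" (everything before the first dot). */
    private static String roleOf(String fieldName) {
        int dot = fieldName.indexOf('.');
        return dot < 0 ? fieldName : fieldName.substring(0, dot);
    }

    /** Keeps the first occurrence of each name and appends "(n)" to the role of later ones. */
    static List<String> renameDuplicates(List<String> fieldNames) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> merged = new ArrayList<>();
        for (String name : fieldNames) {
            int count = seen.merge(name, 1, Integer::sum);
            if (count == 1) {
                merged.add(name);
            } else {
                String role = roleOf(name);
                merged.add(role + "(" + count + ")" + name.substring(role.length()));
            }
        }
        return merged;
    }

    /** Roles that appear in the merged field names but are not defined in the package. */
    static Set<String> unknownRoles(List<String> mergedNames, Set<String> packageRoles) {
        Set<String> unknown = new TreeSet<>();
        for (String name : mergedNames) {
            if (!packageRoles.contains(roleOf(name))) {
                unknown.add(roleOf(name));
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // The same field name coming from three different forms.
        List<String> merged = renameDuplicates(Arrays.asList(
                "SignerId1.Capture1", "SignerId1.Capture1", "SignerId1.Capture1"));
        // prints [SignerId1.Capture1, SignerId1(2).Capture1, SignerId1(3).Capture1]
        System.out.println(merged);
        // prints [SignerId1(2), SignerId1(3)]
        System.out.println(unknownRoles(merged, Collections.singleton("SignerId1")));
    }
}
```

Under this model the JSON payload still defines only SignerId1, so every renamed field points at a role the package does not know about.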

harishaidary | Posts: 1812

Reply to: Document Extraction Error

Friday, August 18, 2017 at 08:33am

Hi Cyril, in your case, I would suggest using Text Tags instead of document extraction. Essentially, this will remove the need for form fields, as you will simply use text tags to position signatures and fields in your documents. You will also not run into any issues when merging your documents.

Gotcha | Posts: 44

Reply to: Document Extraction Error

Friday, August 18, 2017 at 10:15am

Interesting... I'm reading about the Text Tag feature but I'm not completely clear. I'm using a PDF, so for document extraction I use a text field named in the convention you specify, [SignerId1.Capture1]. If I understand correctly, for Text Tags I need the name to have the format {{esl:role:fieldType}}, but how is that different from document extraction, since I still need to make sure the name is UNIQUE?

The sample code you shared uses a Word document. Would you have a sample with a PDF? I just need to know whether you use a text field control on the PDF (which is invisible when the PDF is viewed or printed) or plain text edited directly on the PDF (which is visible when viewed in a reader or printed).

If it is a text string {{esl:role:fieldType}} on the PDF, this will cause another challenge for me, as the same PDF template is used for clients who will be using ESL and for those who won't. For the latter, the end result is that they will see the strings {{esl:role:fieldType}} in the template PDF after it is merged but not sent to ESL, polluting the actual contract.

Gotcha | Posts: 44

Reply to: Document Extraction Error

Thursday, August 24, 2017 at 06:57am

Hi Haris, the Text Tags feature will not work for us, as the same set of form templates (which are all PDFs) will be used for both types of clients: those who enable the digital signature feature via ESL and those who do not have this feature. Having "tags" that appear on the form is therefore not an acceptable solution for those who do not have a digital signature feature in their setup.

So, going back to document extraction, I tried respecting the UNIQUENESS of the text field names by putting a unique # in the Capture token of the tag. This way it was successfully sent to ESL without any error. However, the end result is not what I expected. Let's say each of my PDF forms has a UNIQUE integer ID, and I append this ID to the token for ESL document extraction. Assume I am printing a bundle consisting of forms 1111, 2222 and 3333, and each of those 3 forms has a signature section at the end where the signatures of 2 applicants are required. So I have:

Form 1111: [SignerId1.Capture1111], [SignerId2.Capture1111]
Form 2222: [SignerId1.Capture2222], [SignerId2.Capture2222]
Form 3333: [SignerId1.Capture3333], [SignerId2.Capture3333]

I expected to have 6 signature controls in the bundle of 3 forms, where the 3 controls starting with SignerId1 prompt for a signature from the 1st signer (defined in the JSON) and the 3 starting with SignerId2 require a signature from my 2nd signer in the JSON. For some reason, only the 1st occurrence is detected and gets a signature control; [SignerId1.Capture2222] and [SignerId1.Capture3333] are NOT detected. Is this a bug? Similarly, for signer 2, only the 1st occurrence is detected and gets a signature control; [SignerId2.Capture2222] and [SignerId2.Capture3333] are NOT detected.

Never mind, it is still renaming with a suffix (#) even though my ID is unique. I'm going to investigate this odd behaviour further with the 3rd party.

However, if you have any other pointers on my challenge described above, please let me know.

Gotcha | Posts: 44

Reply to: Document Extraction Error

Thursday, August 24, 2017 at 09:38am

Here's an update: I've recreated the POC, which had only 1 PDF with 3 pages and multiple occurrences of the signature of the same person: [SignerId1.Capture1], [SignerId1.Capture2]. Previously, ESL would prompt for 2 controls to be signed. However, with the latest release (I'm pointing at e-signlive.ca), this has changed (compared to esignlive.com) and I can only obtain 1 signature control (the last one), which is the same behaviour I encountered above. So I'm stuck and cannot proceed with providing the complete solution on our end, where a bundle of forms is combined and sent as one package so that your API extracts and creates the signature controls for each PDF within the bundle.

harishaidary | Posts: 1812

Reply to: Document Extraction Error

Friday, August 25, 2017 at 07:41am

Hi Cyril, if I'm understanding this correctly, you're now having issues with only 1 document? Can you send that document to my email so I can have a look?

Gotcha | Posts: 44

Reply to: Document Extraction Error

Monday, August 28, 2017 at 05:52am

Here's an update on this original ticket:

1. Document extraction was returning an error because the field names on the bundled PDF (multiple forms) were being renamed by a 3rd-party tool when merging data, which appends (#) to the names to make sure they are unique.
2. Going back to a single-form PDF works well as long as every field has a unique Field Name property on the PDF template.
3. In my real-life use case, for an application we bundle multiple PDFs. Each PDF works individually, as it respects the UNIQUE naming convention, but since each PDF has the same signature roles, the names are no longer UNIQUE when combined.

I'm now stuck with this challenge and would appreciate suggestions on how to work around the problem.

As far as I know, the text fields need to be unique when using document extraction. So having text fields with #1, #2, #3, etc. will not work. To me, it looks like an issue with the 3rd-party library that you're using, which you will need to debug with them. How about you try naming your text fields this way:

Form1: [SignerId1.Capture1] [SignerId1.Capture2] [SignerId1.Capture3]
Form2: [SignerId1.Capture4] [SignerId1.Capture5] [SignerId1.Capture6]
Form3: [SignerId1.Capture7] [SignerId1.Capture8] [SignerId1.Capture9]

Doing document extraction with all 3 documents merged or individually will work.
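
The numbering suggested above, where the Capture index keeps counting across forms instead of restarting at 1, can be generated rather than typed by hand. This is a minimal sketch with hypothetical class and method names; it only builds the name strings and assumes the templates are then given these names by whatever tool maintains them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only builds field-name strings.
final class CaptureNumbering {
    /**
     * signaturesPerForm[i] is how many times the role signs on form i.
     * Returns one list of field names per form, with a Capture index
     * that runs across all forms so no name is repeated in the bundle.
     */
    static List<List<String>> fieldNames(String role, int[] signaturesPerForm) {
        List<List<String>> forms = new ArrayList<>();
        int next = 1; // running index shared by all forms
        for (int count : signaturesPerForm) {
            List<String> names = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                names.add("[" + role + ".Capture" + next + "]");
                next++;
            }
            forms.add(names);
        }
        return forms;
    }

    public static void main(String[] args) {
        // Three forms with three signatures each, as in the example above.
        List<List<String>> forms = fieldNames("SignerId1", new int[] {3, 3, 3});
        for (int i = 0; i < forms.size(); i++) {
            System.out.println("Form" + (i + 1) + ": " + String.join(" ", forms.get(i)));
        }
    }
}
```

The same call can be repeated per role (SignerId2, ExtraSignerId1, and so on), since uniqueness only has to hold for the full field name.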

Thanks for your quick response. Yes, you are right, the field names have to be unique even on a single form. This brings me back to my POC, which was originally done with 1 form, making sure each occurrence of the signature field for a particular role had the child name Capture#, where # is incremented for each occurrence on the form.

Going back to my real-case scenario, the multiple forms exist as standalone PDF templates. Although each has uniquely identified signature controls individually, when they are "appended" by the 3rd party to make 1 package, the fields are renamed wherever they are duplicated. (This will always result in duplicate field names, as each form technically involves the same roles that require a signature.) As I am appending all PDF forms into 1 big PDF form, this technique does not seem to work because the fields are renamed. From an API perspective, the single bundle was perfect, as I only needed to create one big package for each loan and issue one call to ESL.

The other obvious solution, given this constraint, is to create an individual package for each PDF form for one applicant. This will result in multiple calls to create individual packages and become cumbersome with all the emails generated for a single loan. The follow-up will also become very impractical for the sender. Is there any other solution you can propose given my use case? Thanks