Uploading Large Files with ILINX Capture and IIS
October 12, 2011
From time to time I receive questions about large file uploads with ILINX Capture. ILINX Capture can upload files of any size. The limitation lies within Internet Information Services (IIS) and/or the amount of memory installed in the web server. This is true not only for ILINX Capture but for any ASP or ASP.NET application.
Depending on the architecture of the ASP or ASP.NET application, files being uploaded to the web server are typically streamed into the web server’s memory during the upload process before being written to disk. The number of users concurrently uploading files and the size of those files determine how much physical memory should be installed in the server. By default, IIS has a 200 KB size limit for uploading a single file. This limit can be increased, but do not raise it any higher than necessary, or you risk overconsuming the web server’s memory.
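As a rough illustration of the sizing reasoning above, the following sketch multiplies the number of concurrent uploads by the maximum file size to get a worst-case figure for memory held by in-flight uploads. The function name and the example numbers (25 users, 100 MB files) are assumptions for illustration only, and the result ignores whatever else the server needs memory for.

```python
MB = 1024 * 1024  # bytes per megabyte, matching the byte values IIS expects


def estimate_upload_memory_bytes(concurrent_uploads: int, max_file_bytes: int) -> int:
    """Worst case: every concurrent upload is buffered in memory at full size."""
    if concurrent_uploads < 0 or max_file_bytes < 0:
        raise ValueError("inputs must be non-negative")
    return concurrent_uploads * max_file_bytes


if __name__ == "__main__":
    # Hypothetical load: 25 users each uploading a 100 MB file at the same time.
    peak = estimate_upload_memory_bytes(25, 100 * MB)
    print(f"Worst-case upload buffering: {peak / 1024 ** 3:.2f} GiB")
```

A figure like this is only an upper bound, but it shows why the upload limit should not be set higher than the largest file you actually expect.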
Configuring File Upload Size in IIS 6
1. Open Internet Information Services Manager by clicking the Windows Start Menu and Run. Type inetmgr and click OK.
2. Once IIS Manager opens, navigate the tree, right-click the server name, and click Properties.
3. From the server properties window check the Enable Direct Metabase Edit checkbox and click OK.
4. Browse to the C:\windows\system32\inetsrv directory and edit the Metabase.xml file with a text editor such as Notepad.
5. Search for the attribute AspMaxRequestEntityAllowed and edit the value to the size in bytes that you want to allow for a maximum upload size. Save and close the Metabase.xml file.
AspMaxRequestEntityAllowed="204800"
6. Open the Registry editor and navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSOAP\30\SOAPISAP.
7. Modify the MaxPostSize value. Set it to the maximum upload size in bytes, entered as a decimal number, and click OK.
8. Reboot the web server to ensure the changes have taken effect.
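The Metabase.xml edit in step 5 can also be scripted. The sketch below is one possible approach, not a supported tool: it takes the file contents as text and replaces the number inside AspMaxRequestEntityAllowed="...". The function name and the sample XML fragment are hypothetical and simplified. Back up Metabase.xml before trying anything like this on a real server.

```python
import re

# Matches the attribute name, the opening quote, the digits, and the closing quote.
_ATTR = re.compile(r'(AspMaxRequestEntityAllowed\s*=\s*")(\d+)(")')


def set_asp_max_request_entity(metabase_text: str, max_bytes: int) -> str:
    """Return metabase_text with every AspMaxRequestEntityAllowed value replaced."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    updated, count = _ATTR.subn(
        lambda m: f"{m.group(1)}{max_bytes}{m.group(3)}", metabase_text
    )
    if count == 0:
        raise ValueError("AspMaxRequestEntityAllowed attribute not found")
    return updated


if __name__ == "__main__":
    # Simplified, hypothetical fragment; a real Metabase.xml has many more attributes.
    sample = '<IIsWebService Location="/LM/W3SVC" AspMaxRequestEntityAllowed="204800" />'
    print(set_asp_max_request_entity(sample, 100 * 1024 * 1024))
```

To apply it, you would read the file into a string, pass it through the function, and write the result back. This still requires the Enable Direct Metabase Edit setting from step 3.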
Configuring File Upload Size in IIS 7
1. Open Internet Information Services Manager by clicking the Windows Start Menu and Run. Type inetmgr and click OK.
2. Navigate the tree to the Virtual Directory for which you would like to enable large file uploads.
3. In the Features View pane double click ASP.
4. In the ASP settings pane, edit the Maximum Requesting Entity Body Limit and Response Buffering Limit properties. Set each to the maximum file upload size in bytes and click Apply.
5. Open the Windows Command Prompt and enter the following command. Change the maxAllowedContentLength value to your maximum file upload size in bytes and press Enter to execute the command.
C:\Windows\System32\inetsrv\appcmd set config "Default Web Site" -section:requestFiltering -requestLimits.maxAllowedContentLength:104857600
6. Open the Registry editor and navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSOAP\30\SOAPISAP.
7. Modify the MaxPostSize value. Set it to the maximum upload size in bytes, entered as a decimal number, and click OK.
8. Reboot the web server to ensure the changes have taken effect.
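If you need to set the request filtering limit on several sites, the appcmd command from step 5 can be assembled programmatically. This is a minimal sketch: the helper name is hypothetical, the appcmd path and arguments are the ones shown above, and the upper bound assumes maxAllowedContentLength is stored as an unsigned 32-bit integer. The script only builds and prints the arguments; it does not run anything.

```python
APPCMD = r"C:\Windows\System32\inetsrv\appcmd"  # path as used in step 5
UINT32_MAX = 4294967295  # assumed ceiling for maxAllowedContentLength


def build_appcmd_args(site_name: str, max_bytes: int, appcmd_path: str = APPCMD) -> list:
    """Build the argument list that sets maxAllowedContentLength for one site."""
    if not 0 < max_bytes <= UINT32_MAX:
        raise ValueError("max_bytes must be between 1 and 4294967295")
    return [
        appcmd_path,
        "set",
        "config",
        site_name,  # passed as a single argument, so spaces need no manual quoting
        "-section:requestFiltering",
        f"-requestLimits.maxAllowedContentLength:{max_bytes}",
    ]


if __name__ == "__main__":
    args = build_appcmd_args("Default Web Site", 100 * 1024 * 1024)
    print(args)
    # On the web server, from an elevated prompt, the list could be executed with:
    #   import subprocess; subprocess.run(args, check=True)
```

Passing the arguments as a list avoids quoting problems with site names that contain spaces, such as "Default Web Site".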
Bryan Wilhelm
Senior Systems Engineer
ImageSource, Inc.
Tuning Abbyy FlexiCapture Layouts and Document Definitions
September 16, 2011
So you have spent many hours analyzing and creating the layouts and definitions for the documents you need to process through Abbyy. Now you should be almost ready for production, except that you still need to tune. Run many samples of the documents in question through the system and check the results very carefully to find and fix all the little issues that will be present.
Tuning is less about finding bugs in your definitions than about finding the small differences among the printed documents being processed. These differences may come from printing offsets, where a pre-printed form is run through a printer that adds the actual data to extract. In other cases, the Header or Footer elements are not extracted correctly. All of these differences can add up to Abbyy failing to detect the correct document definition to apply to the scanned images.
To correct these issues, analyze the results very carefully in the Design Studio. Import the document in question into the Studio and then process it. Look carefully at what was missed. Many times the cause is a Search Area that is not large enough to cover all the letters and numbers to be extracted. Also, within a group, the required and optional flags have a lot to do with whether the group is found. All it takes is one search element within the group that is not found, and the entire group may be marked as not found, so be sure to check the flags carefully.
With multiple Document Definitions, there will be times when a specific document matches some other definition instead of the one it should. This can be caused by the error percentage on the wrong document definition being set too high when both document definitions share a similar field to extract. To fix this, lower the error percentage by a few points and try the recognition again.
Tuning a document definition takes considerably more effort when you are dealing with multiple document definitions and with paper documents that are difficult to scan with enough clarity for the OCR engine to work properly. This is especially true for transcript-type documents, where each transcript has its own copy protection mechanism that the scanning software must try to compensate for. However it works out, be prepared to spend the time and effort to get the document definitions to the point where they work most of the time.
Christopher J. Hillenburg
Senior System Engineer
ImageSource, Inc.



