From time to time I receive questions about large file uploads with ILINX Capture. ILINX Capture can upload files of any size. The limitation lies in Internet Information Services (IIS) and/or the amount of memory installed in the web server. This is true not only for ILINX Capture, but for any ASP or ASP.Net application.

Depending on the architecture of the ASP or ASP.Net application, files being uploaded to the web server are typically streamed into the web server’s memory during the upload process before being written to disk. The number of users concurrently uploading files and the size of those files determine how much physical memory should be installed in the server. By default, IIS has a 200KB size limit for uploading a single file. This can be increased, but do not set it any higher than necessary or you risk overconsumption of the web server’s memory.
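As a rough back-of-the-envelope sketch of that sizing reasoning, the Python snippet below multiplies concurrent uploads by the maximum file size. It assumes the worst case, where every in-flight upload is buffered entirely in memory. The function name and the example load are illustrative assumptions, not measurements from ILINX Capture or IIS.

```python
def peak_upload_memory_bytes(concurrent_uploads: int, max_file_bytes: int) -> int:
    """Worst-case memory if every concurrent upload is fully buffered in RAM."""
    if concurrent_uploads < 0 or max_file_bytes < 0:
        raise ValueError("inputs must be non-negative")
    return concurrent_uploads * max_file_bytes


MB = 1024 * 1024

# Hypothetical load: 25 users each uploading a 100 MB file at the same moment.
estimate = peak_upload_memory_bytes(25, 100 * MB)
print(f"Worst-case upload buffering: {estimate / MB:.0f} MB")
```

The result is only an upper bound for upload buffering; the operating system, IIS itself, and the application need memory on top of it.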

Configuring File Upload Size in IIS 6

1. Open Internet Information Services Manager by clicking the Windows Start Menu and Run. Type inetmgr and click OK.

2. Once IIS Manager opens, navigate the tree, right-click the server name, and click Properties.

3. From the server properties window check the Enable Direct Metabase Edit checkbox and click OK.

4. Browse to the C:\windows\system32\inetsrv directory and edit the Metabase.xml file with a text editor such as Notepad.

5. Search for the attribute AspMaxRequestEntityAllowed and edit the value to the size in bytes that you want to allow for a maximum upload size. Save and close the Metabase.xml file.

AspMaxRequestEntityAllowed="204800"

6. Open the Registry editor and navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSOAP\30\SOAPISAP.

7. Modify the MaxPostSize key. Set the decimal value to the maximum upload size in bytes and click OK.

8. Reboot the web server to ensure the changes have taken effect.

Configuring File Upload Size in IIS 7

1. Open Internet Information Services Manager by clicking the Windows Start Menu and Run. Type inetmgr and click OK.

2. Navigate the tree to the Virtual Directory that you would like to enable large file uploads.

3. In the Features View pane double click ASP.

4. In the ASP settings pane, edit the Maximum Requesting Entity Body Limit and Response Buffering Limit properties. Set these to the maximum file upload size in bytes and click Apply.

 

5. Open the Windows Command Prompt and enter the following command. Change the maxAllowedContentLength to your maximum file upload size in bytes and hit enter to execute the command.

C:\Windows\System32\inetsrv\appcmd set config "Default Web Site" -section:requestFiltering -requestLimits.maxAllowedContentLength:104857600

6. Open the Registry editor and navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSOAP\30\SOAPISAP.

7. Modify the MaxPostSize key. Set the decimal value to the maximum upload size in bytes and click OK.

8. Reboot the web server to ensure the changes have taken effect.
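Because every setting above takes its value in bytes, a small helper can take the guesswork out of the conversion. This is a minimal Python sketch. The helper names are my own, and the command string simply mirrors the appcmd example in the IIS 7 steps. It only builds the text of the command and does not run anything.

```python
def megabytes_to_bytes(megabytes: int) -> int:
    """Convert a size in megabytes (1 MB = 1024 * 1024 bytes) to bytes."""
    return megabytes * 1024 * 1024


def build_appcmd_command(site_name: str, max_bytes: int) -> str:
    """Build the appcmd command line from the IIS 7 steps for a given limit."""
    return (
        r"C:\Windows\System32\inetsrv\appcmd set config "
        f'"{site_name}" -section:requestFiltering '
        f"-requestLimits.maxAllowedContentLength:{max_bytes}"
    )


limit = megabytes_to_bytes(100)
print(limit)
print(build_appcmd_command("Default Web Site", limit))
```

The same byte value would then go into the ASP limits and the MaxPostSize registry value so that all three settings agree.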

Bryan Wilhelm
Senior Systems Engineer
ImageSource, Inc.

The feature set in ILINX Capture is vast, and it can be a drag reviewing and interpreting feature lists in software documentation.  Those of you not familiar with ILINX Capture can visit www.ilinxcapture.com, or feel free to leave a comment and we can provide additional information and/or a hands-on demonstration.  In short, ILINX Capture is a web-based capture platform that excels in distributed capture and custom capture workflow environments.  It scales down to a single workstation, or it can be extended to an enterprise-wide global standard for capture in your organization.

I wanted to use this post to touch on a couple of the features that I see being used more and more in ILINX Capture.  These features became part of the product based on customer feedback, industry direction, and internal vision for the product.  All of the following features can be added to any point in your process flow map, so it provides not only the functionality but also the flexibility to adapt to the business needs of current processes in place today.

  1. 2D Barcode Support   – This feature adds the ability to read metadata, classify and separate documents, and provide quality control checks through the recognition of 2D barcodes.  Through a GUI the user has the ability to parse the barcode data and map it to fields, separate and identify the type of document, and validate that the number of pages in the document matches what was captured through the scanning or electronic import process. 
  2. Web Service Integration  – This feature provides ILINX Capture with the ability to integrate with any existing web service.  Most commonly, we see this used to perform database lookups or validations against existing line of business systems.  It is also being used to interact with other organizational processes.  For example, you can create a support ticket in an organization’s support system every time a process exception occurs in their fully automated capture workflow.
  3. Queue Thresholds & Triggers  – Work queues in ILINX Capture are areas where human interaction is required to process data or documents through the workflow.  The thresholds and triggers provide the ability to monitor the batches or documents in a queue and execute a function when a threshold or trigger is met.  This is useful to monitor escalations or the processing of high priority documents.  For example, if a fax comes in to the system for an auto loan or stock trade, in most cases, this is a time sensitive process that needs to move rapidly through the workflow.  Between the notification features and the thresholds/triggers, ILINX Capture can ensure that 1) a user is notified that there is high priority work to process, 2) the documents are processed within a defined time frame, and 3) if the documents are not processed the system can notify a manager or route the documents to another user group.
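To make the threshold/trigger idea concrete, here is a purely illustrative Python sketch. It is not the ILINX Capture API: `QueueItem`, `overdue_items`, and `run_trigger` are hypothetical names, and the 30-minute threshold is an assumed example. It shows only the general pattern of checking queue items against a time threshold and calling an action, such as notifying a manager, for each one that exceeds it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List


@dataclass
class QueueItem:
    batch_id: str
    priority: str              # e.g. "high" or "normal"
    entered_queue_at: datetime


def overdue_items(items: List[QueueItem], max_wait: timedelta,
                  now: datetime) -> List[QueueItem]:
    """Return high-priority items that have waited longer than max_wait."""
    return [
        item for item in items
        if item.priority == "high" and now - item.entered_queue_at > max_wait
    ]


def run_trigger(items: List[QueueItem], max_wait: timedelta, now: datetime,
                action: Callable[[QueueItem], None]) -> int:
    """Call action for each overdue item and return how many were escalated."""
    overdue = overdue_items(items, max_wait, now)
    for item in overdue:
        action(item)
    return len(overdue)


now = datetime(2010, 8, 1, 12, 0)
queue = [
    QueueItem("fax-001", "high", now - timedelta(minutes=45)),
    QueueItem("fax-002", "high", now - timedelta(minutes=5)),
    QueueItem("scan-003", "normal", now - timedelta(hours=3)),
]

escalated: List[QueueItem] = []   # stand-in for "notify a manager"
count = run_trigger(queue, timedelta(minutes=30), now, escalated.append)
print(count, [item.batch_id for item in escalated])
```

In the product itself this is configured on the process flow map rather than written as code; the sketch is only meant to show the logic a threshold expresses.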

These are just a few of the features that have been added to extend the functionality of this product.  Stay tuned to this blog for additional information on other features that help shape this product to provide value to its customer community. 

Ryan Keller
ImageSource, Inc.

When researching Enterprise Content Management capture projects, the question of handwriting recognition comes up again and again — and many people aren’t sure what to expect.  More commonly, their expectations are unrealistic. They think there is no hope at all, ever. On the other end of the spectrum, some think that tiny fevered cursive scribblings from a rushed meeting can be scanned (or even faxed) and read with accuracy. In helping people think about their forms and the viability of capturing handwriting, I have a few simple guidelines to consider which seem to apply in a majority of cases.

  • Are handwritten forms really the only option?  If the form is available online, can the data be made “fillable” and then submitted directly to your database tables?  Can you let the user fill the form online and print, thus producing machine print and eliminating handwriting?  How about taking the data that a user entered and bar coding it (if the form must be printed rather than be submitted)?  Also helpful and sometimes overlooked:  prefilling form  data from your database through a merge process with a bar code index for retrieval of that same data.
  • Does your Capture software support ICR?  Intelligent Character Recognition (ICR) is what you need to read handwriting.  Optical Character Recognition (OCR) is much more common and is designed to read machine print.  Please don’t try to make it read handwriting – you won’t like the results!
  • Make sure the handwriting is constrained. Annoying? Perhaps. But making the person filling the form write in boxes sets you up for the most successful ICR results.  The catch phrase here could be “Curse the cursive”.  When a character is joined to another character it is faster to write.  However,  the ICR software really struggles to figure out where one character starts and another stops.  And here’s where recognition tanks.   With the real world example below, we can generally expect 100% recognition.

  • Ask for all caps handwriting. You can often tell your ICR engine to look for upper case characters only. This really increases accuracy. And when the form filler forgets to write AS IF SHOUTING, you can often get OK results anyway.
  • Show them how! I know it may seem condescending, but consider this a helpful reminder to those who would blow through the blocks in a mad dash. Show users an example of the way to write in constrained print fields.  And here’s where you can tell them to use all-caps, and show it in your example.

  • Use key index values and database lookups! If there is an employee number, unique phone number, SSN/TaxID, or other unique ID for the person filling the form, use it whenever you can. Then perform a database lookup to confirm identity and optionally populate any other fields that you may need that happen to exist already in your database.
  • Less is More. People burn out on filling lengthy forms using constrained print fields.  Try to minimize the amount they need to write and careless handwriting will decrease.
  • Comb fields can work too. If you think all those constrained print boxes are just too hideous looking, try using comb fields instead. But remember, as soon as people ignore the combs and write cursively or sloppily, ICR results plummet.

  • Use Drop Out Colors for the boxes. If your scanner and ICR software support color dropout technology, you make the ICR engine’s job easier. The boxes aren’t recognized by the scanner, but the handwriting is. So now the constrained print box lines (which make sure each handwritten character is isolated in a target area) don’t have to be considered during ICR.
  • Use OMR bubbles if you really, really need a perfect index value from handwriting. Remember filling in page one of a standardized test?  This painful process might be worth it. This is called Optical Mark Recognition.  Since the engine just needs to confirm if a bubble is filled or not, this is easier and more accurate than OCR or ICR.

  • Faxing? Well, OK. But recognition levels will go down.

With these hints in mind, you can look forward to results that are perhaps short of miraculous – that is, less accurate than OCR.  By all means, the results are still worthwhile and produce great time savings when properly implemented.    There are more tricks to describe, which I may save for a later blog.  Please contact ImageSource if you have any questions about capturing handwriting in forms.

First off, ABBYY means “keen eye”, an apt name for a product that dynamically and automatically captures and processes widely disparate documents.  Powerful document recognition separates and classifies docs, and state-of-the-art optical character recognition rips the data from the images.  I like the motto that pops up on screen – “take the data, leave the paper”.  I love doing just that, sending paper briskly off to start its next recycled life.  It’s the greenest thing to do, especially when compared to filling endless cabinets and long-term off-site storage facilities.

When you want to recommend, sell, support, and solve major customer problems with ECM software at ImageSource, due diligence mandates a thorough feature review and testing.  I’ll describe some of the steps I was involved with in this process for ABBYY FlexiCapture – but mine is but a single slice of the vet team pie.  Development teams and other engineering teams performed specific examinations to answer questions about integration, APIs, and more narrow capabilities to solve unique problems faced by eager customers.  Also, ImageSource staff with a variety of titles took a week-long training course with intensive labs.  Unfortunately I missed the class but was given the opportunity to spin up for a pre-sales demo last year, which was a lot of fun.

So here’s a peek at our process:

 Laptop Install

First things first!  I like to be able to run new software on my laptop whenever possible.  This frees me from all bandwidth and location constraints.  I can easily focus on the vet effort on a plane, down by the river, wherever and whenever.  ABBYY FlexiCapture has a convenient ‘Standalone Installation’ which gives you access to all the key components on one box.

 Obtain Sample Images from Client

In this case we gathered dozens of hardcopy invoices from a large international corporation.  The images were not pretty and included originals, copies, printed faxes, you name it.

 Ascertain Server Needs

After reviewing the ABBYY documentation we set the requirements for our labs – memory per server, disk space, software required, scan station requirements, scanner requirements, and required operating systems.

 Spin Up VMs

Thanks to Mike Peterson we had three servers up in no time.

Convening the Team, Locking Down the ‘War Room’

Gene Eckhardt, Jeff Doyle and I met in our Olympia office for a week.  Gene secured the war room where we periodically met with developers, project managers, engineers, and principals.  Most of the time it was the three of us banging away.

 Lab Software Install

Now we installed ILINX Capture on one server, ABBYY’s ‘Distributed Installation’ on another server, and SQL Server on the last.  This architecture would mimic what we’d encounter in the field – and the standalone install wouldn’t cut it anyway, as it doesn’t scale and it uses SQL Express as a support database.  As installed, we could easily add more servers for high-volume stress testing.  By running a WebEx all week we were able to record every moment of each day’s work, easily pass the focus from machine to machine, and give remote colleagues a view of what we were doing.  We involved ABBYY tech support when we had a question and felt we could speed up an installation process.  Turns out we could, and it was great to have the technician join our session without delay and see what was up.  Also, as we installed we meticulously kept a running log of any issues – however minor – we encountered.  At the end of each day Gene led a review session where we discussed and polished the invaluable ‘Lessons’ doc.

 End-To-End Test

This was our ‘Hello World’ moment – we set up communication between ILINX Capture and ABBYY, and created an appropriate ILINX Capture workflow.  Then we created a simple FlexiLayout, exported it, imported it into FlexiCapture, and created a document definition and an export.  We configured the scanner and the scan station and established we had end-to-end connectivity.

 Building Generic FlexiLayouts

One of the many goals of our week was to share baseline knowledge as well as advanced techniques for capturing documents.  We identified  two forms that were relatively easy to identify  and constituted a large amount of the total paper volume.  In short order we had FlexiLayouts and document definitions configured.  Then it was time to tweak and refine.  The ability to chain elements together worked outstandingly – find a keyword, then find the nearest zip code with the help of regular expressions.  Then using out-of-the-box settings we could  find the state, city, address, and addressee.   Wow, powerful.
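The “find a keyword, then find the zip code near it” idea can be illustrated outside of FlexiCapture. The Python sketch below is not FlexiLayout language and says nothing about how ABBYY implements element chaining. It assumes that “nearest” can be simplified to “the first ZIP-like match after the keyword”, and the function name, sample text, and US ZIP pattern are my own assumptions.

```python
import re
from typing import Optional

# Assumed pattern: a 5-digit US ZIP code with an optional +4 extension.
ZIP_PATTERN = re.compile(r"\b\d{5}(?:-\d{4})?\b")


def first_zip_after_keyword(text: str, keyword: str) -> Optional[str]:
    """Find the keyword, then return the first ZIP-like value that follows it."""
    keyword_match = re.search(re.escape(keyword), text, flags=re.IGNORECASE)
    if keyword_match is None:
        return None
    # Start the ZIP search where the keyword ends, so earlier numbers are ignored.
    zip_match = ZIP_PATTERN.search(text, keyword_match.end())
    return zip_match.group(0) if zip_match else None


sample = (
    "Invoice 10045\n"
    "Remit To: Acme Supply, 123 Main St, Olympia, WA 98501\n"
    "Total due: 250.00"
)
print(first_zip_after_keyword(sample, "remit to"))
```

Anchoring the search on the keyword is what keeps the five-digit invoice number at the top of the sample from being mistaken for a ZIP code.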

Building an Uber FlexiLayout

Now it was time to roll up our sleeves and build a smarter FlexiLayout that could capture invoices from a variety of sources.  We used advanced features such as FlexiLayout alternatives, element groups, object collection elements, and other settings to start recognizing semi-structured forms from a wide variety of sources.  Then we added a little bit of FlexiLayout language code to help us “crawl” around the identified forms to find dates and monetary amounts that could sometimes be below keywords, or to the right, etc.  We didn’t need to script any validation rules for our purposes, but I showed some script I had created prior to our meeting.  A quick unit test showed great results – we had now stepped away from a model where each form had to have its own FlexiLayout.

 Running Recognition Tests

We traded our lab coats for testing hazmat suits and ran many batches of documents, both ones we had used in development and ones we had never looked at before.

 Recording Results

While never a thrill, here we benefitted from a spreadsheet created by Jeff Martin, Gene Eckhardt and Brandon Konen that allowed easy entry of recognition results.  This is known as our “Advanced Capture Analysis and Comparison Tool”, highly regarded in our ranks.  The data was automatically crunched, allowing us to very quickly establish baselines, compare our scan results with other products, share our results with coworkers and principals, etc.

Lessons Learned Doc Revisited

It’s a privilege to be able to work with industry veterans such as Jeff Doyle and Gene Eckhardt on a project such as this.  They brought years of experience with them to improve every process we covered.  While evaluating the Lessons Learned doc, they were able to extrapolate possible impacts in environments and scenarios they have seen in the field.  They also added fresh mitigation alternatives to work through problems encountered.  Our Lessons Learned docs are part of a large and valuable knowledge base that has been added to at ImageSource year after year.

Findings and Conclusions Write-Up

After a demonstration to some coworkers who needed to ramp up on our configuration, we collaborated to create a summary document, and here Gene took the lead.  We were able to draw on the Lessons Learned doc, the Advanced Capture Analysis and Comparison Tool, and meeting notes to piece together our findings and quantify our conclusions.  The summary outlined the scope of our efforts, including excluded activities, our environment and products tested, results, conclusions, general observations, and Best Practice recommendations.

It’s one thing to kick the tires on a car before purchase.  But a methodical, thorough and thoughtful approach is the norm for analogous software tasks at ImageSource.

This example demonstrates how to use a .Net Web Service WebLookup in the ILINX Capture Client.  The sample C# project can be downloaded at http://downloads.ilinxcapture.com/samples/ilinxweblookupsample.zip. You will need to create an ILINX Document Type with at least the following three fields.

  • Client Account Number
  • Document Type
  • Sub Type

This example only returns the following XML String to ILINX Capture to populate the 3 index values.  The XML string is in the same format that was provided to us from the ProcessXML Function parameter IndexXML.  The Value node of the XML is the index data that is then populated in the web client.  This data can be manipulated in many ways before returning to the calling ILINX Capture Web Client.

<ILINX><IndexList><Index><Name>Client Account Number</Name><Label>Client Account Number</Label><ReadOnly>0</ReadOnly><Visible>1</Visible><Value>Client Account Number</Value></Index><Index><Name>Document Type</Name><Label>Document Type</Label><ReadOnly>0</ReadOnly><Visible>1</Visible><Value>Document Type</Value></Index><Index><Name>Sub Type</Name><Label>Sub Type</Label><ReadOnly>0</ReadOnly><Visible>1</Visible><Value>Sub Type</Value></Index></IndexList></ILINX> 
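The downloadable sample is a C# project; the sketch below only re-expresses the same idea in Python for illustration. It takes an index XML string in the format shown above, fills in each Value node by index Name, and returns the XML. The function names and the lookup values are hypothetical, and this is not the actual ProcessXML implementation.

```python
import xml.etree.ElementTree as ET


def set_index_values(index_xml: str, values: dict) -> str:
    """Return index_xml with each matching Index's Value node replaced."""
    root = ET.fromstring(index_xml)
    for index in root.iter("Index"):
        name = index.findtext("Name")
        value_node = index.find("Value")
        if value_node is not None and name in values:
            value_node.text = values[name]
    return ET.tostring(root, encoding="unicode")


def make_index(name: str) -> str:
    """Build one Index element in the layout shown above, with an empty Value."""
    return (
        f"<Index><Name>{name}</Name><Label>{name}</Label>"
        "<ReadOnly>0</ReadOnly><Visible>1</Visible><Value></Value></Index>"
    )


incoming = (
    "<ILINX><IndexList>"
    + make_index("Client Account Number")
    + make_index("Document Type")
    + make_index("Sub Type")
    + "</IndexList></ILINX>"
)

# Hypothetical lookup results; a real service might query a line-of-business system.
lookup = {
    "Client Account Number": "ACME-001",
    "Document Type": "Invoice",
    "Sub Type": "Utilities",
}

result = set_index_values(incoming, lookup)
print(result)
```

Matching on the Name node rather than on position keeps the lookup independent of the order in which the client sends the index fields.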

To use the web service it must be published to an IIS Web Server.  Once published to the web server you must access the web service and generate a WSDL file.

1.       Navigate to the new Web Service ASMX file, for example: http://lptbryan/ilinxweblookupsample/ilinxweblookupsample.asmx?wsdl

2.       Save this file to the C:\inetpub\ILINX\QXServices or equivalent in your environment as the filename.WSDL, for example ilinxweblookupsample.WSDL

3.       Open the newly created file with a text editor like Notepad and modify the ProcessXMLResponse section.  Change ProcessXMLResult to Result.

 

Original WSDL File

 

Modified WSDL File

 

4.       Navigate to the section <wsdl:service name="Service1"> and change the address location for both port name="Service1Soap" and port name="Service1Soap12" to use the DNS or load balanced name of your server if applicable.

 

5.       Add the Web Service WSDL file name to the ILINX Capture Index of choice.

 

6.       Your new web service should now be ready to use.

7.       Log into the ILINX Capture Client and tab out of the Index that was just configured to use a Web Service Lookup.

Before tabbing out of field

 

After tabbing out of field

 

Bryan Wilhelm
Senior Systems Engineer
ImageSource, Inc.

In the years that we have been doing ILINX Capture implementations, a common question comes up from IT administrators: “Do we have to log on to the server to access the administration and management features?”  The answer is: “No, the ILINX Capture Server Manager can be used from any remote computer with the proper access.”

This comes up because there is a lot of functionality in the ILINX Capture Server Manager, not only from the technical administration perspective but also from the business management side.  For example, you have the ability to monitor the system status, review audit logs, configure security, monitor batches/documents and queues, and much more.  All of this functionality can be individually enabled or disabled for specific users and groups depending on their needs.

The following steps show you how to install the Server Manager for remote access to the software’s administration features:

1.  Run the Software Install and choose “ILINX Server Manager” from the install package…

 

2.  Ensure that the workstation can reach the ILINX Capture database over the database port.  For example, if you are using SQL Server for your database the default port is 1433.

3.  Provide the user/group with the proper administration access. 

a.  For a Line of Business Manager, it is common to just allow them access to monitor and manage the batches/documents in their queues

b.  For Tech Support, it is common to allow access to the Audit logs and System Monitoring features
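One quick way to confirm the port access described in step 2 is a plain TCP connection test from the workstation. This is a minimal Python sketch using only the standard library. The host below is a placeholder to replace with your own database server name, and a successful connection shows only that the port is reachable, not that ILINX Capture credentials or permissions are correct.

```python
import socket


def can_reach_port(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Placeholder host; replace with the name of your database server.
DATABASE_HOST = "127.0.0.1"
SQL_SERVER_DEFAULT_PORT = 1433

print(can_reach_port(DATABASE_HOST, SQL_SERVER_DEFAULT_PORT))
```

If this prints False from the workstation but True from the server itself, a firewall between the two is the usual suspect.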

This remote administration and management functionality has proven to be a valuable tool for this software over the years.  For more advanced remote admin/management tasks in ILINX Capture, feel free to leave a question in the comments and I will respond.  If enough similar questions are asked, I will drop another post on the more advanced tasks. 

Ryan Keller
ImageSource, Inc.
 

ILINX Product Suite

July 31, 2010

I am not usually out to promote specific products on this blog, but I have been getting really excited about the latest advancements in the ILINX Product Suite.  It is an area where I, along with other experienced ECM technologists, have applied our expertise to create and refine solutions that provide real-world value for businesses implementing or using ECM solutions.  Take a minute to read this quick post and judge for yourself the value that ILINX Products can provide for your organization.

You may be hearing the word ILINX used in Enterprise Content Management circles more and more these days.  From the humble beginnings of a simple release script connecting a document capture system to an ECM repository, the ILINX Product Suite has grown into a set of powerful, easy-to-use products that provide quick ROI.  There are multiple levels to the ILINX Product Suite, ranging from a full-blown, web-client-based document capture system (ILINX Capture) and an ECM repository (ILINX Content Store) to a variety of middleware products, like ILINX Integrate, that can provide time savings and productivity-boosting results.

If you are not familiar with all that the Product Suite has to offer, check out the ILINX website for the details and product demos.

-Ryan Keller
