Electronic Data Capturing for Large Format Documents

Wilfred Siu, Ph.D., P.Eng.

The first step in integrating paper with CAD and Engineering Records Management (ERM) systems is data capturing. This involves scanning, raster cleanup and editing, and raster-to-vector conversion. These data capturing and editing steps are followed by data integration into the document management and/or workflow systems. This synopsis is mainly concerned with the first steps: data capturing and editing. (Readers interested in the broader issues of Engineering Records Management are referred to an excellent paper titled "How to Integrate Paper with CAD", by David J. Wilson. Copies of the paper are available from our office. Call for your free copy.)

SCANNING

The first and critical step in paper integration is scanning. A good scan is essential for the follow-up image cleanup, editing and vectorization. Good separation of text, quality line representation and smooth raster geometry are important aspects of a good scan; the resolution number alone (usually given in dpi, dots per inch) does not tell the whole story.

Complicating the selection of a scanner are manufacturers' claims of superior proprietary "thresholding technology", which, in theory at least, automatically adjusts the scanner's settings for optimal scan results. More telling is the fact that all scanners provide manual override of their automatic settings. Also, most large format scanners use multiple lenses to capture the full width of a document, and rely on software to adjust for any misalignment resulting from multi-lens scanning. Single-lens scanners are available, however, and theoretically should result in better alignment. The best test of a scanner is to try it on your own documents, covering a wide range of media and quality.

RASTER EDITING

Even with the best scanner, the resulting raster images often require cleanup for best viewing quality.
The most common raster image cleanup operations are deskewing and despeckling. Many raster editing applications are available. Most are global in scope, meaning that they provide a one-step deskewing or despeckling process covering the entire document. While a one-step process may sound attractive, it does not provide the user interaction needed for documents of any reasonable complexity. For example, a one-step despeckling will remove all speckles below a specified size from the entire drawing, but it may also remove decimal points in critical dimensions. Higher-end raster editing applications provide both one-step global editing and targeted (selective) editing features that allow for user interaction. In some high-end raster editing applications, the raster images take on vector-like properties (e.g. raster objects such as lines, arcs and circles), and the applications include features like "object snap".

VECTORIZATION

Most entry-level vectorization applications provide what is in essence overhead digitizing. A scanned image is brought up on screen, and the user "traces" over the image with CAD entities. This is at best pseudo-vectorization. It is labor intensive, especially when large portions of the drawing have to be vectorized.

High-end vectorization applications allow groups of image objects to be vectorized automatically. Raster objects may be grouped globally, as in the entire drawing, or selectively by windowing or other user-specified commands. Thus, as with raster editing applications, vectorization applications fall into two basic classes: global vectorization and interactive vectorization. Software that allows for one-step conversion of entire drawings suffers from the same drawbacks as its raster editing counterpart. An interactive vectorization package lets the user work with the software, making the most of both the user and the technology.
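
The trade-off between one-step and selective processing described above can be illustrated with despeckling. The following is a minimal Python sketch, not the method of any particular product: it assumes a scanned drawing represented as a small grid of 0/1 pixels, treats every connected group of set pixels smaller than a threshold as a speckle, and optionally spares groups that fall inside user-selected windows. The names `find_components`, `despeckle`, `size_threshold` and `protected` are hypothetical and chosen only for this illustration.

```python
from collections import deque


def find_components(image):
    """Return the connected groups of set pixels (8-connectivity)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited set pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def despeckle(image, size_threshold, protected=()):
    """Return a copy of image with small pixel groups cleared.

    A group is cleared when it has fewer than size_threshold pixels,
    unless any of its pixels lies inside a protected window given as
    (top, left, bottom, right), with inclusive bounds (an assumption
    made for this sketch).
    """
    def is_protected(pixels):
        return any(
            top <= y <= bottom and left <= x <= right
            for (y, x) in pixels
            for (top, left, bottom, right) in protected
        )

    cleaned = [row[:] for row in image]
    for pixels in find_components(image):
        if len(pixels) < size_threshold and not is_protected(pixels):
            for y, x in pixels:
                cleaned[y][x] = 0
    return cleaned


# A toy "drawing": one long line, one noise speck, one decimal point.
drawing = [
    [1, 1, 1, 1, 1, 1, 1],  # a drawn line, far larger than any speckle
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],  # scanner noise at (2, 1)
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],  # decimal point of a dimension at (4, 5)
]

# One-step global despeckling: the decimal point is lost with the noise.
global_result = despeckle(drawing, size_threshold=2)

# Selective despeckling: the user windows the dimension text first.
selective_result = despeckle(
    drawing, size_threshold=2, protected=[(4, 4, 4, 6)]
)
```

In the global run both isolated pixels are cleared, including the decimal point; in the selective run the windowed decimal point survives while the noise is still removed. A real raster editor works on far larger images and with more refined criteria, but the user's role in marking what must be kept is the same.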
Interaction between the user and the system can also help develop advanced "expert system" features such as trainable ICR (Intelligent Character Recognition) engines.

Raster-to-vector conversion can be done in part or in full, and may be done in batch runs or interactively. Batch runs work best when drawing quality is good and the drafting style is consistent. The results of batch conversions normally require a high level of quality audit.

Text Conversion

Most vectorization software still does not provide good text conversion. It is often advisable (and, in many applications, necessary) to leave the text out of vectorization, and to enter it directly in the CAD application over its image counterpart. ICR technology is improving daily, and high-end vectorization software now often provides it. However, because of the inherent difficulty of handwriting recognition, most ICR engines are of dubious value if no mechanism is available to "train" the system to recognize handwritten characters and even word strings. Such features are available in top-end vectorization software.

Partial vs Full Vectorization

Partial vectorization results in a hybrid drawing. Drawings are scanned in, cleaned up and stored in raster form (usually by an outside service bureau). Parts of the drawing may then be vectorized to allow for modifications in CAD. When plotted as hard copy, the hybrid (raster/vector) drawing is indistinguishable from a full CAD version. Partial vectorization is gaining popularity as more and more people discover the CADOVERLAY capability in AutoCAD 14, which allows for one-at-a-time object picking in vectorization, an intermediate level between overhead digitization and true vectorization.

While partial vectorization may fulfil many needs, there are many reasons for full vectorization of raster images, not the least of which is user reluctance to deal with hybrid drawings.
In addition, in many modelling and manufacturing applications, only fully vectorized documents are acceptable. Full conversion may be done in batch runs, or on individual documents with full user intervention.

WORK-STREAM IN DATA CAPTURING

While the data capturing program for a specific organization has to be tailored to that organization's strengths and limitations, a typical work stream may resemble the following:

Preliminary Document Database
A preliminary document database is built, in which documents are grouped in their natural order, e.g. by project or by facility.

Document Assembly for Scanning
Documents are grouped according to media, document size, quality, etc., so as to minimize changes to scanner settings.

Scanning & Raster Cleanup / Editing
Scanning and raster editing should be done in close proximity, so that rescanning can be initiated when a scan would otherwise require too much cleanup.

Updating the Database
The document database is updated with details from the drawings. The level of detail required depends on the requirements of the Engineering Records Management (ERM) system, and can go down to the equipment hierarchy level.
Batch Vectorization (Excluding Text)
Raster drawings are grouped for batch vectorization in overnight runs. It is suggested that a batch run cover only as many drawings as can be cleaned up in the next shift.

Vectorization Cleanup
Quality assurance is provided through interactive conversion and cleanup. This includes text conversion.

VIEWING

Scanned drawings are often integrated into an Engineering Records Management (ERM) system for information distribution. The drawings are often used in conjunction with documents from other sources. Not all users of the ERM system need to modify the drawings, but all require the capability to view, and sometimes annotate, the drawings and documents in the system.

Many viewing packages are available, some of which are free. However, since end-user frustration is often the reason a system fails to gain acceptance, it is essential to invest in good viewing software. A good candidate must be simple and intuitive to use, be capable of viewing multi-format documents, have redlining and annotation capabilities, integrate seamlessly with peripherals such as printers and plotters, and be easy to incorporate into the overall ERM system. Sometimes the ERM system requirements alone dictate the choice of viewing software.

Benefits of Electronic Data Capturing

Data capturing is the first step in creating an ERM system. As such, its full benefits cannot be discussed in isolation. However, even by itself, electronic data capturing offers indisputable benefits far beyond mere economics.

Electronic data capturing
Our drawing conversion clients report up to 50% cost savings in producing new drawings based on existing paper documents.
Storing multiple copies of documents at different sites is the only guarantee of data security for mission-critical documents against natural and man-made disasters. Duplicating documents for multi-site storage is economically feasible only with electronic documents. Electronic data capturing provides the first critical step in achieving data security.
copyright 1999 by Merwin Engineering Ltd. All Rights Reserved
Please e-mail merwin@constructworld.com if you have any comments or suggestions.