Create your first automation in just a few minutes.Try Studio Web →
by YouTube
6
Activity
361
Summary
Summary
An activity pack used to read and process extra large excel files.
Overview
Overview
Objectives:
Observation and limitations:
The activity will not keep track of the formats used to display values in the Excel appliation. For example, a cell with a value of 43006 formatted as date will be displayed as 9/28/2017 in Excel, but our activity will still retrieve it as: 43006.
Files in which the row starts with the index 0 (this index can be found in the xml markup of the row) will not be processed. The row number also needs to be under 1048576 (the maximum number of rows in an Excel Sheet). If a row number outside of this interval is found, the application will throw an exception. This limitation is also available for the Read Range in the Workbook activities package.
This package is not compatible with .xls files or .csv files. Only .xlsx files can be read with this activity. Reading Large XLS files can be done in conjunction with the Conversions Package which will convert the file to an xlsx file, thus making it compatible with this package.
Some very minor differences when deciding where a range ends between the standard Excel processing activities and mine
Technical Approach:
The files will be processed in the background using the Open XML SDK library and the SAX (Simple API for XML) approach which instead of reading the entire document and exposing its DOM, will independently read the XML files that comprise the Excel document.
Read Range
Principles
This custom activity reads a Range of cells from an excel Sheet has.
This custom activity achieves its goal by using the Open XML SDK framework and the OpenXmlReader class to parse the File in a SAX like approach. To get the number of rows, we simply parse the file and count the rows.
The custom activity is created by simply inheriting the CodeActivity class from the Windows Workflow Foundation framework.We define the logic of the function by implementing the Execute method and we define the input and output of the activity by fields of the InArgument and OutArgument types.
This activity can only be placed inside an XL Excel Application Scope activity (not necessarily directly inside, but one of its parents must be an XL Excel Application Scope).
NOTE: Installation Guidelines for Custom Activities can be found here
Features
Features
This approach works only on xlsx files. Using SAX, the xml content of the file is loaded gradually, this means that the content is retrieved from the file without overloading the RAM memory or needing to have Microsoft Office installed. XLS files can be processed too, but they need to be converted to xlsx first. The package has been successfully tested with files sizes of +200MB and over 1 million rows. There's no reason this solution could not read even bigger files, however, for this it would be needed to read the file in chunks.
Additional Information
Additional Information
Dependencies
DocumentFormat.OpenXML (> = 2.5.0) Microsoft.Office.Interop.Excel (> = 15.0.4795.1000)
Code Language
Visual Basic
Runtime
Windows Legacy (.Net Framework 4.6.1)
Technical
Version
0.0.6754.28931Updated
February 18, 2020Works with
Compatible with all versions of UiPath Studio
Certification
Silver Certified
Support
UiPath Community Support