MarketplaceStudioActivityRead Extra Large SpreadSheets

Create your first automation in just a few minutes.Try Studio Web

Read Extra Large SpreadSheets

Read Extra Large SpreadSheets

by YouTube

Star

6

Activity

Downloads

361

back button
back button
carouselImage0
next button
next button

Summary

Summary

An activity pack used to read and process extra large excel files.

Overview

Overview

Objectives:

  • Providing a robust and facile manner of processing very large excel files in the background and without relying on COM Interop.
  • The implementation of custom activities that will give the user the last row of the excel file (extremely useful when trying to process a file in chunks). Currently trying to read a file in chunks is impossible without knowing the number of rows and columns in the sheet.
  • The implementation of an Application Scope activity that will allow the grouping of the custom activities in this package.
  • Providing activities that are similar in structure and in the type of parameters to the current Excel application

Observation and limitations:

The activity will not keep track of the formats used to display values in the Excel appliation. For example, a cell with a value of 43006 formatted as date will be displayed as 9/28/2017 in Excel, but our activity will still retrieve it as: 43006.

Files in which the row starts with the index 0 (this index can be found in the xml markup of the row) will not be processed. The row number also needs to be under 1048576 (the maximum number of rows in an Excel Sheet). If a row number outside of this interval is found, the application will throw an exception. This limitation is also available for the Read Range in the Workbook activities package.

This package is not compatible with .xls files or .csv files. Only .xlsx files can be read with this activity. Reading Large XLS files can be done in conjunction with the Conversions Package which will convert the file to an xlsx file, thus making it compatible with this package.

Some very minor differences when deciding where a range ends between the standard Excel processing activities and mine

Technical Approach:

The files will be processed in the background using the Open XML SDK library and the SAX (Simple API for XML) approach which instead of reading the entire document and exposing its DOM, will independently read the XML files that comprise the Excel document.

Read Range

Principles

This custom activity reads a Range of cells from an excel Sheet has.

This custom activity achieves its goal by using the Open XML SDK framework and the OpenXmlReader class to parse the File in a SAX like approach. To get the number of rows, we simply parse the file and count the rows.

The custom activity is created by simply inheriting the CodeActivity class from the Windows Workflow Foundation framework.We define the logic of the function by implementing the Execute method and we define the input and output of the activity by fields of the InArgument and OutArgument types.

This activity can only be placed inside an XL Excel Application Scope activity (not necessarily directly inside, but one of its parents must be an XL Excel Application Scope).

NOTE: Installation Guidelines for Custom Activities can be found here

Features

Features

This approach works only on xlsx files. Using SAX, the xml content of the file is loaded gradually, this means that the content is retrieved from the file without overloading the RAM memory or needing to have Microsoft Office installed. XLS files can be processed too, but they need to be converted to xlsx first. The package has been successfully tested with files sizes of +200MB and over 1 million rows. There's no reason this solution could not read even bigger files, however, for this it would be needed to read the file in chunks.

Additional Information

Additional Information

Dependencies

DocumentFormat.OpenXML (> = 2.5.0) Microsoft.Office.Interop.Excel (> = 15.0.4795.1000)

Code Language

Visual Basic

Runtime

Windows Legacy (.Net Framework 4.6.1)

Publisher

YouTube

Visit publisher's page

Trusted Source

License & Privacy

License Agreement

Privacy Terms

Technical

Version

0.0.6754.28931

Updated

February 18, 2020

Works with

Compatible with all versions of UiPath Studio

Certification

Silver Certified

Application

Microsoft Excel

Support

UiPath Community Support

Similar Listings