Blue prism stock discussion

Subject: Looking for Efficiencies - Remove Duplicates from Collection

I've been tasked with tracking differences between two different Excel files for reporting purposes.

My first thought was to gather them as collections through the Excel VBO and then Remove Duplicates via Collection Manipulation, but this had a poor success rate.

So my second thought was to create a nested loop over the two collections (which contain the data from the spreadsheets) and compare the values of both (there are 20 columns per row to check against). In case I haven't described that very well, here's the flow: if there is a match, remove the row and go to the next entry; if there is no match, write those values to a new collection - this is my 'unique data'.

Now this works perfectly well in terms of accuracy. However, at the moment it only seems feasible for fairly small spreadsheets due to the time taken to execute. Admittedly these timings are taken from stepping over in Debug mode, but for two collections with 117 rows the run takes about a minute. For 278 rows it takes about 6 minutes, whereas for 400 rows it took about 15 minutes, and I've just given up on waiting for a collection with 827 rows after about an hour. The biggest collection it will ultimately need to parse through is about 1,800 rows, which is obviously not feasible with this seemingly exponential growth in processing time.

Are there any existing code stages that might make this more manageable whilst maintaining accuracy? The other thought that occurred to me is to split the collections into batches of 100 rows or so, collect the unique data from those and then re-process these 'unique data' collections to get down to a final data set, although this feels very inefficient as well.

Reply: If you have all the information in Excel files, you can use the OLEDB connector to query both Excel files.

Thanks a lot for this, Pablo - this looks very interesting. Having never used OLEDB previously, though, would you mind guiding me through it a bit? I'm assuming I first need to run 'Set Connection' from the object - is the Database field merely the path to the Excel document? I'm also unsure what I would put in the 'Provider' and 'Connection String' fields.

Having just had a little play with it, I configured 'Set Connection' with only the file path to the Excel document in the Database field, as mentioned above, then opened a connection - both stages were stepped over successfully - but when I then tried to execute, I was presented with the following:

Internal : Could not execute code stage because exception thrown by code stage: ExecuteNonQuery requires an open and available Connection. The connection's current state is closed.

EDIT: OK, ignore the above! I managed to get the connection established with a connection string, although the 'distinct' SQL command seems very hit and miss, so I wonder if I'm doing something incorrectly. I've merged the two spreadsheets into a single file, saved it, and have the OLEDB connection pointing at this input file. When I execute the command, however, what was a file with ~530 lines only gets reduced to 499 lines. The two files were originally exact copies of each other, with one field renamed to 'THIS IS A TEST' in one and 'THIS IS ANOTHER TEST' in the other, so I would expect all other rows (which have an exact match) to be removed.
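For anyone hitting the same wall: the 'requires an open and available Connection' error simply means the command ran before a usable connection was opened, which will happen if the provider never received a complete OLEDB connection string. The exact fields vary with the version of the Data - OLEDB object in use, but the string the driver ultimately needs looks like the one below. This is a rough, self-contained C# sketch rather than anything from the original thread; the file path and sheet name are invented, and it assumes the Microsoft ACE provider is installed (on older machines the Jet provider with 'Excel 8.0' extended properties is the equivalent).

    // Standalone sketch: open an OLEDB connection to an Excel file and pull
    // the distinct rows from one worksheet. Path, sheet name and provider are
    // assumptions - adjust to match the actual workbook and installed driver.
    using System;
    using System.Data;
    using System.Data.OleDb;

    class DistinctFromExcel
    {
        static void Main()
        {
            string connectionString =
                "Provider=Microsoft.ACE.OLEDB.12.0;" +
                @"Data Source=C:\Temp\Merged.xlsx;" +        // invented path
                "Extended Properties=\"Excel 12.0 Xml;HDR=YES\"";

            using (var connection = new OleDbConnection(connectionString))
            {
                connection.Open();   // the step the error message is complaining about

                // SELECT is a query, so it goes through a data adapter (or
                // ExecuteReader), not ExecuteNonQuery.
                var adapter = new OleDbDataAdapter(
                    "SELECT DISTINCT * FROM [Sheet1$]", connection);

                var distinctRows = new DataTable();
                adapter.Fill(distinctRows);

                Console.WriteLine("Distinct rows: " + distinctRows.Rows.Count);
            }
        }
    }

On the 530-to-499 result: DISTINCT only collapses rows where every selected column matches exactly, so stray trailing spaces, differing number or date formatting, or cells the driver reads as text in one row and as a number in another will all stop two visually identical rows from being merged - worth ruling out before blaming the query itself.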
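On the original question of removing duplicates from a single collection: inside a code stage a Blue Prism collection arrives as a System.Data.DataTable, and .NET already has an all-columns distinct built in, so one call can do what the nested loop does row by row. A minimal sketch follows - the table and column names are made up purely for the example, and C# is assumed as the code stage language (the same call works in VB.NET).

    // Minimal sketch: keep one copy of each row that is distinct across every
    // column, which is the same rule SELECT DISTINCT applies.
    using System;
    using System.Data;

    class RemoveDuplicatesSketch
    {
        static void Main()
        {
            var input = new DataTable();
            input.Columns.Add("Name");
            input.Columns.Add("Amount");
            input.Rows.Add("Alpha", "10");
            input.Rows.Add("Alpha", "10");   // exact duplicate
            input.Rows.Add("Beta", "20");

            // DataView.ToTable(true) returns a new table with duplicates removed.
            DataTable unique = input.DefaultView.ToTable(true);

            Console.WriteLine(unique.Rows.Count);   // prints 2
        }
    }

In a code stage the middle line is the whole job: set the output collection to the result of DefaultView.ToTable(true) on the input collection and the 20-column comparison is handled for you.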
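Finally, for the underlying task - finding the rows that differ between the two files - the nested loop can be replaced by a hash lookup, so the work grows roughly with the sum of the two row counts rather than their product (which is why 827 rows took so much longer than 400). This is a sketch only: the class, method and delimiter choices are mine, not something from the thread, and inside a code stage the two collections would simply be the input DataTables.

    // Sketch: rows of tableA that have no exact, all-column match in tableB.
    using System;
    using System.Collections.Generic;
    using System.Data;

    static class CollectionDiff
    {
        // Join every column value into one key string. The unit separator
        // character keeps adjacent values from running together and faking a match.
        static string RowKey(DataRow row)
        {
            return string.Join("\u001F",
                Array.ConvertAll(row.ItemArray, value => Convert.ToString(value)));
        }

        public static DataTable RowsOnlyIn(DataTable tableA, DataTable tableB)
        {
            // One pass over tableB to index it...
            var seen = new HashSet<string>();
            foreach (DataRow row in tableB.Rows)
                seen.Add(RowKey(row));

            // ...and one pass over tableA to pick out the unmatched rows.
            DataTable result = tableA.Clone();   // same columns, no data
            foreach (DataRow row in tableA.Rows)
                if (!seen.Contains(RowKey(row)))
                    result.ImportRow(row);

            return result;
        }
    }

Calling RowsOnlyIn in both directions (A against B, then B against A) gives the rows unique to each file, which is the difference report described at the top of the thread; 1,800 rows is no trouble at that complexity.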