Power Query won't read from .xls files - powerquery

I am using Office 2010. I have a query that combines data from several Excel files in a folder. ".xlsx" files load fine, but when a ".xls" file exists in the folder, the query will not run and gives the error message "Data could not be retrieved from database". In the query editor, when I click on the row for the file with an error, I see a more detailed error message. Resaving the files as ".xlsx" works, but I'd rather be able to use them as-is.
I have installed the MS Access Database Engine here: http://www.microsoft.com/en-us/download/details.aspx?id=13255 but it doesn't seem to help.
Any other ideas? Thanks!
Edit: Added the two queries. First is the query applied to each file, second is the query that combines them.
Query "Transform Sample File from Supplier CMRTs":
let
Source = Excel.Workbook(#"Sample File Parameter1", null, true),
#"Smelter List_Sheet" = Source{[Item="Smelter List",Kind="Sheet"]}[Data],
#"Removed Top Rows" = Table.Skip(#"Smelter List_Sheet",3),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
#"Removed Other Columns" = Table.SelectColumns(#"Promoted Headers",{"Smelter Identification Number Input Column", "Metal (*)", "Smelter Look-up (*)", "Comments"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each [#"Metal (*)"] <> null and [#"Metal (*)"] <> "")
in
#"Filtered Rows"
Query "Supplier CMRTs":
let
Source = Folder.Files("O:\Supplier CMRTs"),
#"Invoke Custom Function1" = Table.AddColumn(Source, "Transform File from Supplier CMRTs", each #"Transform File from Supplier CMRTs"([Content])),
#"Filtered Rows" = Table.SelectRows(#"Invoke Custom Function1", each [Extension] <> ".txt"),
#"Renamed Columns1" = Table.RenameColumns(#"Filtered Rows", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from Supplier CMRTs"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from Supplier CMRTs", Table.ColumnNames(#"Transform File from Supplier CMRTs"(#"Sample File")))
in
#"Expanded Table Column1"

I found that when I combine binaries, if I select the Sample Binary Parameter instead of a Sheet, and work my way from there, it will not balk at xls vs xlsx files. But before I could even get to the point where I could combine binaries for the folder, I had to filter only to xlsx files. Therefore, after I successfully combine the binaries, I have to go back to the Applied Steps and remove the one where I filtered only to xlsx files.
Here are step-by-step instructions (the original post included screen clips):
I started with 4 Excel Sheets in one Folder, called New Folder:
Here's what their data looks like:
Establish a new source from folder. Do not click Combine & Edit. Click the Edit button:
Filter the Extension column to only xlsx files:
Right-click on the column name for the Content column and then click Remove Other Columns, so you'll only have a Content column:
Click to combine the binaries. Then click the folder level Sample Binary Parameter and click OK:
Go to your Applied Steps and remove the Filtered Rows step, where you filtered to only xlsx files.
Also remove the Changed Type step from the Applied Steps, because it now won't work and isn't needed.
Now your query should work with both your xlsx and xls files.
For completeness, here's what I have at this step (all 4 of my files each have only one sheet, called Sheet1 in each, which is why you see 4 Sheet1 names):
Anyhow, the names don't matter for me, so I delete the Name column and expand the Data column to get:
You should recognize the data as the data from all 4 sheets above.
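Written out by hand, the steps above amount to roughly the following query. This is a minimal sketch, not the code the UI generates: the folder path "C:\New Folder" and the step names are assumptions, and whether .xls files can be read at all may still depend on the installed Access Database Engine.

```powerquery
let
    // Hypothetical folder path; replace with your own
    Source = Folder.Files("C:\New Folder"),
    // Keep both legacy and current Excel formats
    ExcelFiles = Table.SelectRows(Source, each List.Contains({".xls", ".xlsx"}, Text.Lower([Extension]))),
    // Equivalent of "Remove Other Columns" on the Content column
    ContentOnly = Table.SelectColumns(ExcelFiles, {"Content"}),
    // Open each binary as a workbook instead of navigating to a named sheet
    WithWorkbook = Table.AddColumn(ContentOnly, "Workbook", each Excel.Workbook([Content])),
    Expanded = Table.ExpandTableColumn(WithWorkbook, "Workbook", {"Name", "Data"}, {"Name", "Data"}),
    // The sheet names are not needed here, so keep only the nested tables
    DataOnly = Table.SelectColumns(Expanded, {"Data"}),
    // Collect every column name that occurs in any sheet
    AllColumns = List.Union(List.Transform(DataOnly[Data], Table.ColumnNames)),
    Result = Table.ExpandTableColumn(DataOnly, "Data", AllColumns)
in
    Result
```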

Related

Consolidating Excel files in Power Query

I have followed online instructions that allow me to do this if all of the files that I need are in the same folder.
Whilst all of the files that I need are indeed in my "Downloads" folder, there are other files in there as well.
Where I am getting lost is in editing my list of the files that I want.
When I press edit it takes me to Power Query (where I can filter the files that I want) but when I press Load and save it just creates a list of those files and doesn't allow me to then combine and edit.
I just end up with a list of the files that I wanted to combine!
Could anybody point me in the right direction please?
Below is some code to combine all Excel (.xlsx) files in a directory. Add your own code at the third line (marked by the comment) to filter the filenames further as needed; the remaining steps then combine the files.
let Source = Folder.Files("C:\subdirectory\directory"),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx")),
// add another filter here as desired
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Name", "Content"}),
#"Added Custom" = Table.AddColumn(#"Removed Other Columns", "GetFileData", each Excel.Workbook([Content],true)),
#"Expanded GetFileData" = Table.ExpandTableColumn(#"Added Custom", "GetFileData", {"Data", "Hidden", "Item", "Kind", "Name"}, {"Data", "Hidden", "Item", "Kind", "Sheet"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded GetFileData",{"Content", "Hidden", "Item", "Kind"}),
List = List.Union(List.Transform(#"Removed Columns"[Data], each Table.ColumnNames(_))),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Columns", "Data", List,List)
in #"Expanded Data"
This might help you work with your info a bit.
To be able to select the files you want to use from the folder, what you want to do is, when you go to get data from folder, click the Transform Data button instead of the pre-selected Combine or Combine & Transform Data button. (Which of the pre-selected buttons you'll see depends upon how you navigated to get data from folder.) After you click the Transform Data button, you'll be presented a table of your folder's contents, with various columns of pertinent info. You can then filter based on the info in the columns and, after filtering, you can click on the button to combine the contents of the filtered files.
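As a rough illustration of filtering first and combining afterwards, here is a minimal sketch. The folder path, the "Report_" file-name prefix, and the choice to read only the first sheet of each file are all assumptions made for the example.

```powerquery
let
    // Hypothetical folder path; replace with your own
    Source = Folder.Files("C:\Users\me\Downloads"),
    // Keep only the files you want: here, .xlsx files whose name starts with a hypothetical prefix
    Wanted = Table.SelectRows(Source, each Text.Lower([Extension]) = ".xlsx" and Text.StartsWith([Name], "Report_")),
    // Read the first sheet of each remaining file, promoting its first row to headers
    WithData = Table.AddColumn(Wanted, "Data", each Excel.Workbook([Content], true){0}[Data]),
    // Stack the sheets into a single table
    Combined = Table.Combine(WithData[Data])
in
    Combined
```

Loading this query returns the combined rows rather than the list of files, because the last step is the combined table and not the filtered folder listing.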

Is there any way to refer to the current column name as a variable in Power Query?

I am new to Power Query and I would like to ask more experienced people about it.
I am trying to solve a problem with Text.Combine: I would like to combine the value in a specific column with the column name of the current cell.
Do you have any idea?
My idea is formula similar to this:
=Text.Combine({[kod], Column.Name},";")
Thank you very much for answer.
Tomas
Edit 8.12.2021:
@horseyride
I am actually trying to fill columns automatically with data in the following format, where the first part is the value from the current row and the second part is the current column name:
For example:
8M0183:F01A0101.B in first row, second column,
8M0182:F01A0102.A in second row, first column
A table example is shown below.
Thank you very much for all answers.
See if this works for you. It combines the KOD column with the column name in each null cell, for every column that is not named KOD.
It finds all columns not named KOD and converts them to text. It replaces all nulls with a placeholder, here "xxx". We unpivot to get everything into just three columns. We then combine the column title and the KOD value wherever the cell content equals the placeholder. Finally, we pivot to get back to the original format.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ColumnList=List.Difference(Table.ColumnNames(Source),{"KOD"}),
ConvertToText= Table.TransformColumnTypes (Source,List.Transform(ColumnList, each {_ , type text})),
ReplaceNullsWithPlaceholder = Table.ReplaceValue(ConvertToText,null,"xxx",Replacer.ReplaceValue,ColumnList),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(ReplaceNullsWithPlaceholder, {"KOD"}, "Attribute", "Value"),
CombineColumnAndRow = Table.AddColumn(#"Unpivoted Other Columns", "Value2", each if [Value]="xxx" then [Attribute]&":"&[KOD] else [Value]),
#"Removed Columns" = Table.RemoveColumns(CombineColumnAndRow,{"Value"}),
#"Pivoted Column1" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Value2")
in #"Pivoted Column1"
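An alternative sketch that avoids the unpivot/pivot round trip loops over the non-KOD columns and fills the nulls column by column. It relies on Table.ReplaceValue accepting an `each` function as the replacement value and evaluating it per row, which is a commonly used pattern but should be verified in your version. The sample table and its values are made up, and the order of the two parts (KOD first, then the column name) follows the format described in the question.

```powerquery
let
    // Hypothetical sample data standing in for Table1
    Source = #table(
        {"KOD", "F01A0101.A", "F01A0101.B"},
        {{"8M0183", "x", null}, {"8M0182", null, "y"}}
    ),
    // Every column except KOD
    OtherColumns = List.Difference(Table.ColumnNames(Source), {"KOD"}),
    // For each such column, replace null with "<KOD>:<column name>"
    Filled = List.Accumulate(
        OtherColumns,
        Source,
        (state, col) => Table.ReplaceValue(state, null, each [KOD] & ":" & col, Replacer.ReplaceValue, {col})
    )
in
    Filled
```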

Bring a value with SUMIFS in Power Query language to a specified row and column (location)

Next step? Using SUMIFS (and many SUMIF formulas against another workbook), I have brought information into an exact row and column of an Excel workbook. Now I want to do the same with the query language. I can bring in two values if a condition is met, but it is unclear how to bring the total sum into a single row of the Excel workbook. Can anyone show me the path? I guess I will need the Data Model...
= Table.AddColumn(#"Changed Type", "Sumif", each if [Column2] =2 or [Column2]=1 then [Column3]+[Column4] else 0)
let
Source = Folder.Files...
#"C:\Users...
#"Imported Excel" = Excel.Workbook(#"C:\...
SegPL_Chart = #"Imported Excel"{[Name="SegPL_Chart"]}[Data],
#"Removed Top Rows" = Table.Skip(SegPL_Chart,12),
#"Removed Alternate Rows" = Table.AlternateRows(#"Removed Top Rows",1,1,90),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Alternate Rows"),
#"Filtered Rows" = Table.SelectRows(#"Promoted Headers", each ([Col1]="1" or [Col1]="2")),
#"Table Group = Table.Group(#"Filtered Rows", {}, List.TransformMany(Table.ColumnNames(#"Filtered Rows",(x)=>{each if x = "Names" then "Totals" else List.Sum(Table.Column(_,x))},(x,y)=>{x,y})),
#"append" = Table.Combine({#"Filtered Rows",#"Table Group"})
in
#"append"
It gives an error at "in": "Token comma needed". What else do I need to do to bring in the total rows?
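Two things in the "Table Group" step look like plausible causes of the parse error: the step name is missing its closing quote, and the call to Table.ColumnNames is missing its closing parenthesis. A minimal sketch of that step with both corrected is below. The sample table is hypothetical and stands in for the "Filtered Rows" step, with numeric columns so that List.Sum has something to add.

```powerquery
let
    // Hypothetical sample standing in for the #"Filtered Rows" step
    FilteredRows = #table({"Names", "Col1", "Col2"}, {{"A", 1, 10}, {"B", 2, 20}}),
    // Group with no key columns, so the whole table becomes one total row.
    // List.TransformMany builds one {column name, aggregation function} pair per column.
    TableGroup = Table.Group(
        FilteredRows,
        {},
        List.TransformMany(
            Table.ColumnNames(FilteredRows),
            (x) => {each if x = "Names" then "Totals" else List.Sum(Table.Column(_, x))},
            (x, y) => {x, y}
        )
    ),
    // Append the total row below the detail rows
    Appended = Table.Combine({FilteredRows, TableGroup})
in
    Appended
```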
You can use several steps to create several helper columns with intermediate results of conditional sums. Then you can create a new column, sum up all the intermediate results, and then delete the helper columns with the intermediate results.
Keep in mind that unlike Excel, the calculations in Power Query always return constants and you can then delete calculated columns you no longer need. So,
Create helper column 1 with complicated IF and Sum scenario
Create helper column 2 with complicated IF and Sum scenario
Create total column to add column 1 + column 2
Delete helper columns and keep only the total column
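The four steps can be sketched as follows. The sample table, the conditions, and the helper column names are assumptions for illustration; only Column2, Column3 and Column4 come from the question.

```powerquery
let
    // Hypothetical sample data
    Source = #table(
        {"Column2", "Column3", "Column4"},
        {{1, 10, 100}, {2, 20, 200}, {3, 30, 300}}
    ),
    // Helper column 1: a conditional value
    Helper1 = Table.AddColumn(Source, "Helper1", each if [Column2] = 1 then [Column3] else 0),
    // Helper column 2: another conditional value
    Helper2 = Table.AddColumn(Helper1, "Helper2", each if [Column2] = 2 then [Column4] else 0),
    // Total column adds the two helpers
    WithTotal = Table.AddColumn(Helper2, "Total", each [Helper1] + [Helper2]),
    // The helpers can be removed because the Total values are already computed
    Result = Table.RemoveColumns(WithTotal, {"Helper1", "Helper2"})
in
    Result
```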
That gives me the exact result I was looking for, but it uses a DAX formula in Power Pivot:
=SUMX(FILTER('TableName',[ColName] = 1),'TableName'[ColName2])
So I would be glad to convert it to a Power Query formula.
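One way to express the same filter-then-sum idea in M is to select the matching rows and sum the column of interest. This is a minimal sketch with a made-up sample table; ColName and ColName2 are the placeholder names from the DAX formula.

```powerquery
let
    // Hypothetical sample data standing in for 'TableName'
    Source = #table({"ColName", "ColName2"}, {{1, 10}, {2, 20}, {1, 5}}),
    // Equivalent of FILTER('TableName', [ColName] = 1)
    Matching = Table.SelectRows(Source, each [ColName] = 1),
    // Equivalent of SUMX over the filtered rows
    Total = List.Sum(Matching[ColName2])
in
    Total
```

With this sample data the query returns 15. The result is a single value rather than a table, so it would need to be wrapped (for example, added as a row or column) before loading it to a worksheet.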

Populate conditional column depending on column name criteria

I receive a weekly report which contains some repetition of columns. This is because it is drawn from a collection of web forms which ask similar questions to each other - let's say they all ask "Do you want to join our email list?" - but this question is stored in the source system as a separate field for each form (each form is effectively a separate table). The columns will always be consistently named - e.g. "Email_optin_1", "Email_optin_2" - so I can come up with rules to identify the columns which ask the email question. However, the number of columns may vary from week to week - one week the report might just contain "Email_optin_2", the next week it might include four such columns. (This depends on which web-forms have been used in that week). The possible values are the same in all these columns - let's say "Yes" and "No".
Each row should normally only have one of the "Email_optin" columns populated.
What I would like to do is create a single column in Power Query called "Email_Optin_FINAL", which would return "Yes" if ANY columns beginning with "Email_optin" contain a value of "Yes".
So, basically, instead of the criteria simply referring to the values in specific columns, what I would like it to do is first of all figure out which columns it needs to be looking at, and then look at the values in those columns.
Is this possible in PowerQuery?
Thanks in advance for any advice!
This would find all the columns containing Email_optin and merge them for you into a new column and remove the original columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
EmailList= List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Email_optin")),
#"Merged Columns" = Table.CombineColumns(Source,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged")
in #"Merged Columns"
This would find all the columns containing Email_optin and merge them for you into a new column and preserve the original columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Index= Table.AddIndexColumn(Source, "Index", 0, 1),
EmailList= List.Select(Table.ColumnNames(Index), each Text.Contains(_, "Email_optin")),
Merged = Table.CombineColumns(Index,EmailList,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Merged Queries" = Table.NestedJoin(Index,{"Index"},Merged,{"Index"},"Merged",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Merged", {"Merged"}, {"Merged"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table2",{"Index"})
in #"Removed Columns"
You can then filter for "Yes" among the merged answers if you want.
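If the goal is a single Yes/No column rather than merged text, a minimal sketch along the same lines is shown below. The sample table is made up, and returning "No" when none of the columns contains "Yes" is an assumption; the question only specifies the "Yes" case.

```powerquery
let
    // Hypothetical sample data standing in for Table1
    Source = #table(
        {"ID", "Email_optin_1", "Email_optin_2"},
        {{1, "Yes", null}, {2, null, "No"}, {3, null, null}}
    ),
    // Work out which columns to look at, whatever their number this week
    OptinColumns = List.Select(Table.ColumnNames(Source), each Text.StartsWith(_, "Email_optin")),
    // For each row, check whether any of those columns holds "Yes"
    WithFinal = Table.AddColumn(
        Source,
        "Email_Optin_FINAL",
        each if List.Contains(Record.FieldValues(Record.SelectFields(_, OptinColumns)), "Yes") then "Yes" else "No"
    )
in
    WithFinal
```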

Inserting text manually in Powerquery

I'm merging multiple Excel files into one where the user can review and mark an additional Comment column as completed. Each day there are additional files, and I need to refresh the query and pull the data in while keeping the original Comment column values.
I've attempted to do this by referencing Marcel Beug's video, but that uses an SQL table and I cannot seem to get it to work with the Excel files as the source.
After the Merge Queries I attempt to modify the first file to my source "InputFile"
![Modify the Merge Formula1][2]
![Changed to last query step of InputFile][3]
![InputFile Query with Source2 and Merge][4]
![M Code of InputFile Query with Merge][5]
By setting the first field in the Merge formula to the last step in the InputFile query I was able to get around the cyclic reference error; however, I find that every refresh creates duplicate rows: 4 become 8, which then become 16, and so on.
let
Source = Excel.Workbook(File.Contents("S:\Fin_Aid\Operations Team\COD mpn - lec\InputFiles\8.22.18 to 8.23.18.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
Rename_RecID = Table.RenameColumns(#"Removed Columns",{{"Column3.1", "RecID"}}),
Source2 = Excel.CurrentWorkbook(){[Name="InputFile"]}[Content],
InputWithComment = Table.TransformColumnTypes(Source2,{{"RecID", Int64.Type}, {"Column1", type text}, {"Column2", type text}, {"Column4", type text}, {"Column5", type text}, {"Comment", type text}}),
#"Merged Queries" = Table.NestedJoin(Rename_RecID,{"RecID"},InputWithComment,{"RecID"},"InputWithComment",JoinKind.LeftOuter),
#"Expanded InputWithComment" = Table.ExpandTableColumn(#"Merged Queries", "InputWithComment", {"Comment"}, {"Comment"})
in
#"Expanded InputWithComment"
Regards,
Jim
