I am using the django-import-export library to import several Excel workbooks. However, I have over 1,000 workbooks that need to be imported into the db. Is there a way to select a folder to upload instead of selecting and uploading each individual file? I've worked through the tutorial found here: https://django-import-export.readthedocs.org/en/latest/getting_started.html#admin-integration but I was unable to find the answer to my question.
Any help would be greatly appreciated.
Posting mainly for future viewers. Currently, django-import-export imports only the active/first sheet of a single Excel workbook. However, the code is easy enough to modify to work around this. In forms.py, there is ImportForm, which is the form used when importing from the admin. Simply change the import_file field to something like this:
    import_file = forms.FileField(
        widget=forms.ClearableFileInput(attrs={'multiple': True}),
        label=_('File to import')
    )
This form is used in admin.py to process the file data. Change the linked line to something like:
    import_files = request.FILES.getlist('import_file')
    for import_file in import_files:
        ...
Now all that's left is to modify the import procedure in base_formats.py for the XLS and XLSX formats. The changes are nearly the same for both, so I will outline only the XLS one here.
Instead of taking the first sheet, run a for loop over the sheets and append each sheet's rows to the dataset.
    dataset = tablib.Dataset()
    first_sheet = True  # If you keep all correct headers only in first sheet
    for sheet in xls_book.sheets():
        if first_sheet:
            dataset.headers = sheet.row_values(0)
            first_sheet = False
        for i in moves.range(1, sheet.nrows):
            dataset.append(sheet.row_values(i))
    return dataset
For XLSX, the loop will run over xlsx_book.worksheets. The rest is similar to the XLS case.
This will allow you to select multiple Excel workbooks and import all the sheets of each workbook. I know the ideal solution would be to import a zip file and create all the data using a single bulk_create, but this serves well for now.
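The merging logic can be tried outside Django. This is a minimal sketch in plain Python, assuming each sheet is already available as a list of rows whose first row is the header. The name merge_sheets is hypothetical and is not part of django-import-export or tablib.

```python
def merge_sheets(sheets):
    """Combine rows from several sheets into one header list and one row list.

    Each sheet is a list of rows and its first row is treated as a header.
    Only the header of the first non-empty sheet is kept.
    """
    headers = []
    rows = []
    for sheet in sheets:
        if not sheet:
            continue  # skip sheets with no rows at all
        if not headers:
            headers = list(sheet[0])
        # Every row after the header row is data.
        rows.extend(list(row) for row in sheet[1:])
    return headers, rows


# Illustrative data standing in for two sheets of one workbook.
workbook = [
    [["id", "name"], [1, "alpha"], [2, "beta"]],
    [["id", "name"], [3, "gamma"]],
]
headers, rows = merge_sheets(workbook)
print(headers)
print(rows)
```

Like the loop above, this trusts that every sheet uses the same column order as the first one; it does not check the later headers.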
Related
Looking to create a spreadsheet of all the file data from a folder of files in Windows 10.
Specifically for this example, the "length" of a group of audio files is desired.
All the methods readily available online, such as "copying file path" or opening the folder in a web browser and copy/pasting the data into a new spreadsheet, seem to allow only for a select few attributes to be captured. Namely, file path, size, and date created.
How can one export other, less common, attributes into spreadsheet form?
Couldn't find a simple answer.
Managed to import the file size using a macro.
From there, used a spreadsheet formula to calculate the length of the audio based on the bitrate.
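The same size-and-bitrate approach can be scripted. This is a rough sketch, assuming constant-bitrate files and a bitrate you already know (128 kbps is only a placeholder default). The function names are made up for illustration, and the estimate ignores tags and container overhead.

```python
import csv
import os


def estimate_duration_seconds(size_bytes, bitrate_kbps):
    """Estimate audio length from file size for a constant-bitrate file."""
    # bytes -> bits, divided by bits per second
    return size_bytes * 8 / (bitrate_kbps * 1000)


def write_folder_report(folder, report_path, bitrate_kbps=128):
    """Write one CSV row per file in `folder`: name, size, estimated length.

    Keep `report_path` outside `folder`, otherwise the report lists itself.
    """
    with open(report_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "size_bytes", "estimated_seconds"])
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                size = os.path.getsize(path)
                seconds = estimate_duration_seconds(size, bitrate_kbps)
                writer.writerow([name, size, round(seconds, 1)])
```

The resulting CSV opens directly in a spreadsheet. Variable-bitrate files will give misleading lengths with this method.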
from google.colab import files
file = files.upload()
df = pd.read_csv('Book 1.xlsx')
df.head(10)
print(df['id'])
When I run this I get a key error for id, which is the first header name in my dataset.
I created this file in Numbers by copy-pasting text in csv format into a blank Numbers spreadsheet and exporting it as a csv file. I have "use header names as labels" turned on in preferences. I only recently got my MacBook, so there may be some small detail I am overlooking. I have also tried removing footer rows, but it isn't working. Please help.
Can you post the output of df.head() as an example? Right now we are just guessing. Also, you are loading a .xlsx file with pd.read_csv, while pandas has a pd.read_excel function for that. Maybe that helps?
Mohana's answer is also worth trying. You are looking up a column by a string, and the actual header string can be different from what you think (for example, because of extra whitespace).
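To see what the header strings really are, it helps to print them with repr(). This is a small standard-library sketch, assuming the file is a genuine CSV; the sample text and the function name are invented for illustration.

```python
import csv
import io


def read_rows_with_clean_headers(text):
    """Return the raw header strings and the rows keyed by stripped headers."""
    # Drop a leading byte-order mark if one survived decoding.
    text = text.lstrip("\ufeff")
    reader = csv.reader(io.StringIO(text))
    raw_headers = next(reader)
    headers = [h.strip() for h in raw_headers]
    rows = [dict(zip(headers, row)) for row in reader]
    return raw_headers, rows


# Illustrative CSV text with a BOM and stray spaces around the header names.
sample = "\ufeffid , name\n1,alpha\n2,beta\n"
raw_headers, rows = read_rows_with_clean_headers(sample)
print([repr(h) for h in raw_headers])  # repr() makes stray spaces visible
print([row["id"] for row in rows])
```

In pandas the equivalent check is to print the column list, for example print(list(df.columns)), before indexing by name.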
Can anyone please suggest how to import '500112' and 'SBIN' from https://www.moneycontrol.com/india/stockpricequote/banks-public-sector/statebankindia/SBI in Google spreadsheet using importData or importXML functions?
Try using
=IMPORTXML(A1,"//ctag[@class='mob-hide']//span") #where A1 is the url
this should get you both.
adding, for example:
=IMPORTXML(A1,"//ctag[@class='mob-hide']//span[1]")
at the end should output just
500112
Edit:
Since the question was asked, the site has started loading this data dynamically, which Google Sheets can't handle. Using your browser's developer tools, you can find out that the target data is loaded from a different URL (see below) and that it is in JSON format.
So you need an importJSON() function for that. Note that importJSON() is not built into Google Sheets; it is a custom function that has to be added to the spreadsheet as an Apps Script first:
A1 = https://priceapi.moneycontrol.com/pricefeed/bse/equitycash/SBI
A2 =importJSON(A1)
Make sure there's enough space on the sheet to expand the output. Once you do, you'll find the two target items under the columns Data Bseid and Data Nseid, probably in columns AL and AM.
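Outside of Sheets, the same two values can be read from the JSON with a few lines of Python. This is only a sketch: the key names "data", "BSEID" and "NSEID" are assumptions inferred from the column titles mentioned above, and the sample payload is invented rather than copied from the real feed.

```python
import json


def extract_ids(payload_text):
    """Return (BSE id, NSE id) from a price-feed style JSON document."""
    payload = json.loads(payload_text)
    # Assumed layout: the interesting fields sit under a top-level "data" object.
    data = payload.get("data") or {}
    return data.get("BSEID"), data.get("NSEID")


# Invented sample payload; the real response has many more fields.
sample = '{"code": "200", "data": {"BSEID": "500112", "NSEID": "SBIN"}}'
bse_id, nse_id = extract_ids(sample)
print(bse_id, nse_id)
```

To use it against the live URL you would fetch the text first, for example with urllib.request, and pass the response body to extract_ids.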
I am trying to pull a number from the Morningstar "Cash Flow" page for an arbitrary stock ticker using XPath. I have tested the XPath against the Morningstar website with an XPath tester and it returned the desired values. However, when I use it in a Google Sheet, it returns #N/A (Imported content is empty.).
=IMPORTXML("http://financials.morningstar.com/cash-flow/cf.html?t=fb&region=usa&culture=en-US", "//div[@id='data_tts1']/div")
I did a bit of research on this and found out that the data on such websites is generated dynamically and downloaded in stages. Therefore, the page needs to be fully loaded before any data can be pulled out of it.
I'm wondering if there is any solution to this issue?
Your help would be much appreciated.
It's empty, as it should be, because the content you are trying to scrape is generated by JavaScript. Google Sheets does not support imports of JS elements. You can always test this by disabling JS for a given site; only what's left can be scraped.
It might be possible, but you have to prepare a custom sheet to extract the data. Use IMPORTDATA to load the .json which contains the data:
http://financials.morningstar.com/ajax/ReportProcess4HtmlAjax.html?&t=XNAS:FB&region=usa&culture=en-US&cur=&reportType=cf&period=12&dataType=A&order=asc&columnYear=5&curYearPart=1st5year&rounding=3&view=raw&r=672024&callback=jsonp1585016592836&_=1585016593002
AFAIK, you can't import the .csv version directly (specific headers are needed, so curl or another dedicated tool would be required).
http://financials.morningstar.com/ajax/ReportProcess4CSV.html?&t=XNAS:FB&region=usa&culture=en-US&cur=&reportType=cf&period=12&dataType=A&order=asc&columnYear=5&curYearPart=1st5year&rounding=3&view=raw&r=764423&denominatorView=raw&number=3
Since this .json is unusual (it contains HTML tags), I don't think a custom script for Google Sheets could import it correctly. So once the .json is loaded in Google Sheets, TRANSPOSE the rows to columns and use formulas to locate your data (target the cells which contain data_s1 and data_s2, for example). Use CONCAT to merge the cells of interest. Then split the result into columns (use a custom separator). SEARCH for the data you want and clean the results with SUBSTITUTE. The method is dirty, but I think the whole process could be automated.
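If doing this outside Sheets is an option, the same steps can be sketched in Python. This assumes the response is a JSONP wrapper (suggested by the callback parameter in the URL) around a JSON object that holds an HTML fragment. The key name "result", the sample payload and the class name TextById are hypothetical.

```python
import json
import re
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not be tracked as open.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


def unwrap_jsonp(text):
    """Parse `callback({...})` or plain JSON text into a Python object."""
    match = re.match(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    return json.loads(match.group(1) if match else text)


class TextById(HTMLParser):
    """Collect the text pieces found inside each element that has an id."""

    def __init__(self):
        super().__init__()
        self.open_ids = []  # id (or None) of every currently open element
        self.texts = {}     # id -> list of text pieces inside that element

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_ids.append(dict(attrs).get("id"))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.open_ids:
            self.open_ids.pop()

    def handle_data(self, data):
        value = data.strip()
        if not value:
            return
        for element_id in self.open_ids:
            if element_id:
                self.texts.setdefault(element_id, []).append(value)


# Invented sample shaped like a JSONP response carrying an HTML fragment.
sample = r'jsonp123({"result": "<div id=\"data_s1\"><div>10</div><div>20</div></div>"})'
payload = unwrap_jsonp(sample)
parser = TextById()
parser.feed(payload["result"])
parser.close()
print(parser.texts)
```

Looking up parser.texts by ids such as data_s1 replaces the TRANSPOSE/CONCAT/SEARCH steps, provided the real response has the structure assumed here.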
I'm using the “spreadsheet” Ruby gem to generate xls files.
I already have an xls file “MyFile.xls” which contains many sheets: sh_01, sh_02, sh_03 …
I want to read the name of the last sheet (sh_last_number), add a new sheet called “sh_last_number+1” to this file (MyFile.xls), and write some data to it.
In other words, I have to open it (read data) and write to it at the same time.
If this can't be done with Spreadsheet, is there another gem that handles it better?
Thanks in advance.
You can definitely do this with the spreadsheet gem. Since you are working with excel files, you may need to require the excel component of the gem if you are using an older version:
require 'spreadsheet' # You may need to require 'spreadsheet/excel'
Then working with and writing sheets is simple. To open the workbook (an xls file with multiple sheets) you do something like:
@workbook = Spreadsheet.open("MyFile.xls")
And then to add a sheet to the workbook you've opened, you simply:
new_sheet = "sh_#{@workbook.worksheets.size + 1}"
@worksheet = @workbook.create_worksheet(:name => new_sheet)
Hope this helps.
Cheers,
Sean