How to solve IndexError with XPath

I'm a beginner and I've been trying to get metadata from the SEC's website. Here's the link: https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001403161&type=10&dateb=&owner=exclude&count=40
Let's just fetch the dates for now. I'm using XPath, but it raises an IndexError. I checked the fetched HTML and it does seem to contain the data.
My code:
from lxml import html
import requests
page = requests.get('https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001403161&type=10&dateb=&owner=exclude&count=40')
tree = html.fromstring(page.content)
date = tree.xpath('//*[@id="seriesDiv"]/table/tbody/tr[2]/td[4]')[0].text
print(date)
How do I get this to work?
Any help would be greatly appreciated.
Thanks!

I'm not sure what is wrong with the XPath, as that's how I would have written it. But if you don't have to use XPath exclusively, I would go the pandas route, as it parses the whole table and you can pull out individual cells if needed:
pd.read_html() returns a list of dataframes (i.e., one for each <table> tag in the HTML). You just need to pick the table you want, which in this case is index position 2 (the last of the 3 dataframes).
import pandas as pd
url = 'https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001403161&type=10&dateb=&owner=exclude&count=40'
dfs = pd.read_html(url)
df = dfs[-1]
Output:
print (df.to_string())
Filings Format Description Filing Date File/Film Number
0 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2019-07-26 001-3397719978181
1 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2019-04-26 001-3397719771802
2 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2019-01-31 001-3397719556097
3 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2018-11-16 001-33977181189947
4 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2018-07-27 001-3397718974910
5 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2018-04-27 001-3397718783872
6 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2018-02-01 001-3397718567042
7 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2017-11-17 001-33977171209440
8 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2017-07-20 001-3397717974492
9 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2017-04-21 001-3397717774258
10 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2017-02-02 001-3397717568413
11 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2016-11-15 001-33977162000223
12 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2016-07-25 001-33977161782265
13 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2016-04-25 001-33977161589237
14 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2016-01-28 001-33977161369122
15 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2015-11-20 001-33977151244628
16 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2015-07-23 001-33977151002526
17 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2015-04-30 001-3397715819049
18 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2015-01-29 001-3397715559143
19 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2014-11-21 001-33977141240400
20 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2014-07-24 001-3397714991576
21 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2014-04-24 001-3397714781985
22 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2014-01-30 001-3397714558846
23 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2013-11-22 001-33977131236561
24 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2013-07-24 001-3397713983884
25 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2013-05-01 001-3397713803519
26 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2013-02-06 001-3397713578037
27 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2012-11-16 001-33977121209935
28 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2012-07-27 001-3397712990778
29 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2012-05-02 001-3397712805918
30 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2012-02-08 001-3397712582250
31 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2011-11-18 001-33977111214519
32 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2011-07-29 001-3397711996223
33 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2011-05-05 001-3397711815087
34 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2011-02-02 001-3397711566916
35 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2010-11-19 001-33977101205707
36 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2010-08-02 001-3397710982428
37 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2010-05-03 001-3397710789509
38 10-Q Documents Interactive Data Quarterly report [Sections 13 or 15(d)]Acc-no:... 2010-02-03 001-3397710571090
39 10-K Documents Interactive Data Annual report [Section 13 and 15(d), not S-K I... 2009-11-20 001-33977091198831
To print an individual row and column:
print (df.loc[0,'Filing Date'])
2019-07-26

This approach returns the whole Filing Date column as a list:
from lxml import html
import requests
page = requests.get('https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=0001403161&type=10&dateb=&owner=exclude&count=40')
tree = html.fromstring(page.content)
# First data row only (tr[1] is the header row)
Firstdate = tree.xpath('//table[@class="tableFile2"]//tr[2]/td[4]/text()')
print(Firstdate)
# Fourth cell of every row, i.e. the whole Filing Date column
Alldates = tree.xpath('//table[@class="tableFile2"]//tr/td[4]/text()')
print(Alldates)
Output:
['2019-07-26', '2019-04-26', '2019-01-31', '2018-11-16', '2018-07-27', '2018-04-27', '2018-02-01', '2017-11-17', '2017-07-20', '2017-04-21', '2017-02-02', '2016-11-15', '2016-07-25', '2016-04-25', '2016-01-28', '2015-11-20', '2015-07-23', '2015-04-30', '2015-01-29', '2014-11-21', '2014-07-24', '2014-04-24', '2014-01-30', '2013-11-22', '2013-07-24', '2013-05-01', '2013-02-06', '2012-11-16', '2012-07-27', '2012-05-02', '2012-02-08', '2011-11-18', '2011-07-29', '2011-05-05', '2011-02-02', '2010-11-19', '2010-08-02', '2010-05-03', '2010-02-03', '2009-11-20']
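The IndexError in the question most likely comes from the /tbody step. Browser developer tools show a <tbody> because the browser inserts one when it builds the DOM, while the HTML as served may not contain it, so the XPath matches nothing and [0] fails on an empty list. Below is a minimal sketch of that behaviour. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml so it runs without network access, and the snippet is a hypothetical, well-formed approximation of the page, not the real SEC response.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the served markup (assumption): there is no <tbody>.
SNIPPET = """
<div id="seriesDiv">
  <table class="tableFile2">
    <tr><th>Filings</th><th>Format</th><th>Description</th><th>Filing Date</th></tr>
    <tr><td>10-Q</td><td>Documents</td><td>Quarterly report</td><td>2019-07-26</td></tr>
    <tr><td>10-Q</td><td>Documents</td><td>Quarterly report</td><td>2019-04-26</td></tr>
  </table>
</div>
"""

root = ET.fromstring(SNIPPET)

# A path that names <tbody> matches nothing, so indexing the result with [0]
# would raise IndexError.
with_tbody = root.findall(".//table/tbody/tr/td[4]")

# Dropping the tbody step finds the fourth cell of every data row.
without_tbody = root.findall(".//table[@class='tableFile2']//tr/td[4]")
dates = [td.text for td in without_tbody]

# Guard against an empty result instead of indexing blindly.
first_date = dates[0] if dates else None

print(with_tbody)
print(dates)
print(first_date)
```

Checking that the result list is non-empty before indexing turns a crash into a value you can test for, which makes a mismatched path easier to spot.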

Related

Calculate total time based on column in Tableau

I have a table like below:
From Date        | Issue Id | Issue Id (group) | Status      | Till Date
19-07-2021 17:21 | 4        | 4                | Approved    | 19-07-2021 17:23
19-07-2021 17:23 | 4        | 4                | In Progress | 19-07-2021 17:23
19-07-2021 17:23 | 4        | 4                | In Review   | 19-07-2021 17:25
19-07-2021 17:25 | 4        | 4                | In Progress | 19-07-2021 18:56
19-07-2021 18:56 | 4        | 4                | In Review   | 20-07-2021 08:47
20-07-2021 08:47 | 4        | 4                | Resolved    | 20-07-2021 14:45
20-07-2021 14:45 | 4        | 4                | Closed      |
12-07-2021 10:49 | 4        | 4                | Open        | 19-07-2021 17:21
27-04-2016 09:07 | 3        | 3                | Open        | 10-01-2017 08:40
10-01-2017 08:40 | 3        | 3                | Closed      |
10-01-2017 08:40 | 3        | 3                | Resolved    | 10-01-2017 08:40
I need to do the following things:
For Issue Id 4, find the total time (in days, hours, minutes or seconds) spent in a particular status. For example, there are 2 In Review rows, so the total time between From Date and Till Date runs from 17:23 (19-07) till 08:47 (20-07).
Calculate the total time an issue spends between In Review and Closed (here the Till Date for Closed issues is unfortunately null).
Basically, I am trying to create a dashboard where, for each issue ID, I can see how long the issue was "In Review" or "In Progress" before it was closed. The dashboard will have "Issue Id" on the X axis and "Total Time for Review" or "Total Time for Progress" on the Y axis. For example, Issue 4 spent a total of 1:31:01 hours in the "In Progress" state (17:23 to 17:23 on 19th July and 17:25 to 18:56 on 19th July).
I am trying this:
IF [STATUS] = 'In progress' and [STATUS] = 'Closed'
THEN
DATEDIFF('day',[Date Create],[Till Date])
END
but it says that tables can only be aggregated, and only with the Count function.
Can someone please help? How can I create a calculated field for the above scenarios?
Think of your IF statement as being applied to each row: a single row cannot have a status that is both In Progress and Closed, so the condition is never true.
I would arrange the text table like this:
Columns: Status
Rows: Issue ID (group) | Issue ID
Text Mark: Calculated Field (Named something like Total Time).
That will group all of the statuses together. You can change the aliases of the status if you want to say "Total Time for ..."
Then your calculated field would be:
DATEDIFF("day", [From Date], [Till Date])
Make sure that when you drag the pill over, it is aggregated as SUM. That will collapse everything to the status level and then total the days.
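To sanity-check the numbers such a calculated field should produce, the same per-status total can be sketched outside Tableau. The following is plain Python, not Tableau syntax, and the function name total_time_by_status is hypothetical. It uses the Issue Id 4 rows as displayed in the question. Those timestamps carry no seconds, so the In Progress total comes out as 91 minutes rather than the 1:31:01 quoted in the question. In Tableau itself, a finer date part such as 'minute' in DATEDIFF would be needed to see sub-day totals like these.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FMT = "%d-%m-%Y %H:%M"

# Rows for Issue Id 4 copied from the question: (from_date, issue_id, status, till_date).
# An empty till_date stands for the null value on the Closed row.
ROWS = [
    ("19-07-2021 17:21", 4, "Approved", "19-07-2021 17:23"),
    ("19-07-2021 17:23", 4, "In Progress", "19-07-2021 17:23"),
    ("19-07-2021 17:23", 4, "In Review", "19-07-2021 17:25"),
    ("19-07-2021 17:25", 4, "In Progress", "19-07-2021 18:56"),
    ("19-07-2021 18:56", 4, "In Review", "20-07-2021 08:47"),
    ("20-07-2021 08:47", 4, "Resolved", "20-07-2021 14:45"),
    ("20-07-2021 14:45", 4, "Closed", ""),
    ("12-07-2021 10:49", 4, "Open", "19-07-2021 17:21"),
]


def total_time_by_status(rows):
    """Sum (till - from) per (issue_id, status); rows without a till date are skipped."""
    totals = defaultdict(timedelta)
    for from_date, issue_id, status, till_date in rows:
        if not till_date:
            continue
        start = datetime.strptime(from_date, FMT)
        end = datetime.strptime(till_date, FMT)
        totals[(issue_id, status)] += end - start
    return dict(totals)


totals = total_time_by_status(ROWS)
in_progress_minutes = totals[(4, "In Progress")].total_seconds() / 60
in_review_minutes = totals[(4, "In Review")].total_seconds() / 60
print(in_progress_minutes, in_review_minutes)
```

Each row contributes its own duration and the durations are summed per status, which mirrors what a row-level DATEDIFF aggregated with SUM does in the text table described above.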

Oracle: unable to create AWR report

oracle version: 11.2.0.1.0
When I try to generate an AWR report in Oracle as sysdba,
exec dbms_workload_repository.create_snapshot();
I keep getting this error:
ORA-00600: [kewrose_1], [600], [ORA-00600: [13013], [5001], [6213], [8465936], [5], [8447794], [17], [], [], [], [], []
], [], [], [], [], [], [], [], [], []
I have searched online for a long time without success. Please help, or suggest some ideas for how to resolve this.
conn / as sysdba
SQL> @$ORACLE_HOME/rdbms/admin/awrrpt.sql
Current Instance
DB Id DB Name Inst Num Instance
----------- ------------ -------- ------------
3400239050 TEP 1 TEP
Specify the Report Type
AWR reports can be generated in the following formats. Please enter the
name of the format at the prompt. Default value is 'html'.
'html' HTML format (default)
'text' Text format
'active-html' Includes Performance Hub active report
Enter value for report_type: dec
Type Specified: dec
Instances in this Workload Repository schema
DB Id Inst Num DB Name Instance Host
------------ -------- ------------ ------------ ------------
* 3400239050 1 TEPCBC TEP oldman
Using 3400239050 for database Id
Using 1 for instance number
Specify the number of days of snapshots to choose from
Entering the number of days (n) will result in the most recent
(n) days of snapshots being listed. Pressing <return> without
specifying a number lists all completed snapshots.
Enter value for num_days: 1
Listing the last day's Completed Snapshots
Snap
Instance DB Name Snap Id Snap Started Level
TEP TEP 38542 20 Dec 2022 00:30 1
38543 20 Dec 2022 01:30 1
38544 20 Dec 2022 02:30 1
38545 20 Dec 2022 03:30 1
38546 20 Dec 2022 04:30 1
38547 20 Dec 2022 05:30 1
38548 20 Dec 2022 06:30 1
38549 20 Dec 2022 07:30 1
38550 20 Dec 2022 08:30 1
38551 20 Dec 2022 09:30 1
38552 20 Dec 2022 10:30 1
38553 20 Dec 2022 11:30 1
38554 20 Dec 2022 12:30 1
38555 20 Dec 2022 13:30 1
38556 20 Dec 2022 14:30 1
38557 20 Dec 2022 15:30 1
38558 20 Dec 2022 16:30 1
38559 20 Dec 2022 17:30 1
38560 20 Dec 2022 18:30 1
38561 20 Dec 2022 19:30 1
38562 20 Dec 2022 20:30 1
Specify the Begin and End Snapshot Ids
Enter value for begin_snap: 38542
Begin Snapshot Id specified: 38542
Enter value for end_snap: 38544
End Snapshot Id specified: 38544
Specify the Report Name
~~~~~~~~~~~~~~~~~~~~~~~
The default report file name is awrrpt_1_38542_38544.html. To use this name,
press <return> to continue, otherwise enter an alternative.
Enter value for report_name: test2
Using the report name test2
<html lang="en"><head><title>AWR Report for DB: TEP, Inst: TEP, Snaps: 38542-38544</title>
<style type="text/css">
body.awr {font:bold 10pt Arial,Helvetica,Geneva,sans-serif;color:black; background:White;}

Finding the max weekly average in a month

I am stuck on my query attempt. I have a table that lists test results with their dates. I need to run a query to return the highest weekly average for a particular month.
I have the first part figured out:
SELECT `Effluent BOD5`, WEEK(Date)
FROM bod
WHERE YEAR(Date) = 2020 AND MONTH (Date) = 4
ORDER BY WEEK(Date)
Returns:
Effluent BOD5 / WEEK(Date)
10 14
14 14
9 15
6 16
7 16
11 17
8 17
I need to get the result of 12, which is the highest weekly average (week 14).
Any help would be great!
I messed around with this and figured it out! Here is what I used:
SELECT MAX(Total)
FROM
  (SELECT week, AVG(test) AS Total
   FROM
     (SELECT `Effluent BOD5` AS test, WEEK(Date) AS week
      FROM bod
      WHERE YEAR(Date) = 2020 AND MONTH(Date) = 4
      ORDER BY WEEK(Date), `Effluent BOD5` DESC) ab
   GROUP BY week) ac
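The pattern here is two-level aggregation: average per week in an inner query, then take the maximum of those averages in an outer query. A minimal sketch of it follows, run against an in-memory SQLite database rather than MySQL. The table bod_weeks and its column names are hypothetical, and the week number is stored directly (copied from the question's intermediate result) because SQLite has no WEEK() function.

```python
import sqlite3

# (value, week) pairs copied from the question's intermediate result.
SAMPLE = [(10, 14), (14, 14), (9, 15), (6, 16), (7, 16), (11, 17), (8, 17)]

conn = sqlite3.connect(":memory:")
# Hypothetical table: the week number is precomputed instead of derived from a date.
conn.execute("CREATE TABLE bod_weeks (effluent_bod5 REAL, week INTEGER)")
conn.executemany("INSERT INTO bod_weeks VALUES (?, ?)", SAMPLE)

# Inner query: one average per week. Outer query: the largest of those averages.
(max_weekly_avg,) = conn.execute(
    """
    SELECT MAX(weekly_avg)
    FROM (SELECT week, AVG(effluent_bod5) AS weekly_avg
          FROM bod_weeks
          GROUP BY week) AS w
    """
).fetchone()
conn.close()

print(max_weekly_avg)  # 12.0 (week 14: (10 + 14) / 2)
```

The ORDER BY in the innermost query of the answer does not affect the result, since GROUP BY and MAX do not depend on row order, so this sketch leaves it out.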

Adding a new column and inserting values into an existing Excel sheet using SSIS/SSRS

I have an existing Excel template like the one below:
DATE                7/28/2016  7/29/2016  July  MTD  YTD
Call Activity
IB_Calls_Offered    22         29         52    52   52
IB_Calls_Answered   22         29         52    52   52
Sale                6          3          9     9    9
Every day, when my stored procedure is executed, I need to add an extra column (for the current date) and fill in the corresponding data.
Please suggest how to do this using SSIS/SSRS.

Processing time-based values

I have a list of timebased values in the following form:
20/Dec/2011:10:16:29 9
20/Dec/2011:10:16:30 13
20/Dec/2011:10:16:31 13
20/Dec/2011:10:16:32 9
20/Dec/2011:10:16:33 13
20/Dec/2011:10:16:34 14
20/Dec/2011:10:16:35 6
20/Dec/2011:10:16:36 7
20/Dec/2011:10:16:37 16
20/Dec/2011:10:16:38 5
20/Dec/2011:10:16:39 7
20/Dec/2011:10:16:40 15
20/Dec/2011:10:16:41 12
20/Dec/2011:10:16:42 13
20/Dec/2011:10:16:43 11
20/Dec/2011:10:16:44 6
20/Dec/2011:10:16:45 7
20/Dec/2011:10:16:46 9
20/Dec/2011:10:16:47 14
20/Dec/2011:10:16:49 6
20/Dec/2011:10:16:50 11
20/Dec/2011:10:16:51 15
20/Dec/2011:10:16:52 10
20/Dec/2011:10:16:53 16
20/Dec/2011:10:16:54 12
20/Dec/2011:10:16:55 8
The second column contains the value recorded for each second. Values exist for every second of a complete month. I want to sum these values:
Per minute [for 00 - 59 seconds]
Per hour [for 00 - 59 minutes]
Per day [for 0 - 24 hours]
Sounds like a job for Excel and a pivot table.
The trick is to parse the text date/time you have into something Excel can work with; splitting it on the colon will do just that. Assuming the value you have is in cell A2, this formula will convert the text into a real date:
=DATEVALUE(LEFT(A2,SEARCH(":",A2)-1))+TIMEVALUE(RIGHT(A2,LEN(A2)-SEARCH(":",A2)))
Then just create Minute, Hour and Day columns where you subtract out that portion of the date. For example, if the date from the above formula is in C2, the following will subtract out the seconds and give you just up to the minute:
=C2-SECOND(C2)/24/60/60
Then repeat the process for the next two columns to give you the hour and the day:
=D2-MINUTE(D2)/24/60
=E2-HOUR(E2)/24
Then all you have to do is create a pivot table on the data with rows Day, Hour, Minute and value Sum(Value).
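For comparison, the same truncate-then-sum idea can be sketched in Python instead of Excel. This is an alternative illustration, not part of the answer above: the function name aggregate is hypothetical, the sample is the first five lines from the question, and the "%b" month parsing assumes an English or C locale.

```python
from collections import defaultdict
from datetime import datetime

# First five lines of the data shown in the question.
SAMPLE = """\
20/Dec/2011:10:16:29 9
20/Dec/2011:10:16:30 13
20/Dec/2011:10:16:31 13
20/Dec/2011:10:16:32 9
20/Dec/2011:10:16:33 13
"""


def aggregate(lines):
    """Sum the values per minute, per hour and per day."""
    per_minute = defaultdict(int)
    per_hour = defaultdict(int)
    per_day = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue
        stamp, value = line.split()
        # "%b" parses abbreviated month names such as "Dec" (locale dependent).
        ts = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
        n = int(value)
        # Truncate the timestamp to each bucket size, like the subtraction formulas do.
        per_minute[ts.replace(second=0)] += n
        per_hour[ts.replace(minute=0, second=0)] += n
        per_day[ts.date()] += n
    return per_minute, per_hour, per_day


per_minute, per_hour, per_day = aggregate(SAMPLE.splitlines())
for bucket, total in sorted(per_minute.items()):
    print(bucket, total)
```

Truncating the timestamp plays the role of the Minute, Hour and Day helper columns, and the dictionaries play the role of the pivot table's Sum(Value).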
