azure databricks count rows in all tables - is there a better way - azure-databricks

I'm trying to find the best way to get row counts for all my Databricks tables. This is what I came up with:
dfdbs = sqlContext.sql("show databases")
for row in dfdbs.rdd.collect():
    tmp = "show tables from " + row['databaseName'] + " like 'xxx*'"
    if row['databaseName'] == 'default':
        dftbls = sqlContext.sql(tmp)
    else:
        dftbls = dftbls.union(sqlContext.sql(tmp))
tmplist = []
for row in dftbls.rdd.collect():
    tmp = 'select * from ' + row['database'] + '.' + row['tableName']
    tmpdf = sqlContext.sql(tmp)
    tmplist.append((row['database'], row['tableName'], tmpdf.count()))
columns = ['database', 'tableName', 'rowCount']
df = spark.createDataFrame(tmplist, columns)
display(df)

I found this to be significantly faster...
dftbl = sqlContext.sql("show tables")
dfdbs = sqlContext.sql("show databases")
# Collect the table list of every database into one DataFrame.
for row in dfdbs.rdd.collect():
    tmp = "show tables from " + row['databaseName']
    if row['databaseName'] == 'default':
        dftbls = sqlContext.sql(tmp)
    else:
        dftbls = dftbls.union(sqlContext.sql(tmp))
tmplist = []
for row in dftbls.rdd.collect():
    try:
        tmp = 'select count(*) myrowcnt from ' + row['database'] + '.' + row['tableName']
        tmpdf = sqlContext.sql(tmp)
        myrowcnt = tmpdf.collect()[0]['myrowcnt']
        tmplist.append((row['database'], row['tableName'], myrowcnt))
    except Exception:
        # -1 marks a table whose count failed.
        tmplist.append((row['database'], row['tableName'], -1))
columns = ['database', 'tableName', 'rowCount']
df = spark.createDataFrame(tmplist, columns)
display(df)

You can also try this:
def fn_byDBgetCount():
    final_list = []
    dbList = spark.sql("show databases").select("namespace").rdd.flatMap(lambda x: x).collect()
    for databaseName in dbList:
        spark.sql("use {}".format(databaseName))
        tableList = spark.sql("show tables from {}".format(databaseName)).select("tableName").rdd.flatMap(lambda x: x).collect()
        for tableName in tableList:
            tableCount = spark.sql("select count(*) as tableCount from {}".format(tableName)).collect()[0][0]
            final_list.append([databaseName, tableName, tableCount])
    column_names = ['DatabaseName', 'TableName', 'TableCount']
    df = spark.createDataFrame(final_list, column_names)
    display(df)
fn_byDBgetCount()
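Another option, sketched here under assumptions, is to build one UNION ALL statement so that all counts go to Spark as a single query instead of one query per table. The helper name `build_count_query` and the backtick quoting are illustrative and not part of the answers above; the resulting string would be passed to `spark.sql(...)`.

```python
def build_count_query(tables):
    """Return one SQL statement that counts rows for every (database, table) pair.

    `tables` is an iterable of (database, table_name) tuples, for example the
    rows collected from "show tables". Hypothetical helper, for illustration.
    """
    selects = []
    for database, table_name in tables:
        selects.append(
            "select '{db}' as database, '{tbl}' as tableName, "
            "count(*) as rowCount from `{db}`.`{tbl}`".format(db=database, tbl=table_name)
        )
    if not selects:
        raise ValueError("no tables given")
    return "\nunion all\n".join(selects)


query = build_count_query([("default", "sales"), ("staging", "orders")])
print(query)
```

On Databricks this would be used as `display(spark.sql(query))`. Unlike the try/except version above, a single unreadable table would make the whole statement fail, so it may only suit catalogs where every table is known to be queryable.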

Related

Problem resetting a QTreeWidget

I am developing an order-entry application with Python and Qt Designer. I can't place two orders in a row. The first order goes through without any problem, but when I try to place another one without closing the application, this error is displayed:
self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(0, str(Id))
AttributeError: 'NoneType' object has no attribute 'setText'
    def AddCommande(self):
        QtWidgets.QTreeWidgetItem(self.ui.treeWidgetcommande)
        Libelle = self.ui.comboBoxproduit.currentText()
        Qte = int(self.ui.lineEditQteproduit.text())
        Info = self.stock.GetProductName(Libelle)[0]
        Id = str(int(Info[0]))
        Pu = Info[1]
        Total = int(Qte)*int(Pu)
        data=(Libelle,Qte,Id,Pu,Total)
        #print(data)
        self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(0, str(Id))
        self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(1, str(Libelle))
        self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(2, str(Qte))
        self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(3, str(Pu))
        self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setText(4, str(Total))
        self.Ligne +=1
    def ValiderCommande(self):
        Client = self.ui.comboBoxclient.currentText()
        IdClient = self.stock.GetClientIdByName(Client.split(" ")[0])
        PrixTotal = 0
        UniqueId = random.random()
        Date = date.today()
        Data = (IdClient,PrixTotal,Date,UniqueId)
        if self.stock.AddCommande(Data) == 0:
            for i in range(self.Ligne):
                IdCommande = self.stock.GetClientIdByUniqueId(UniqueId)
                Libelle = self.ui.treeWidgetcommande.topLevelItem(i).text(1)
                IdProduit = self.ui.treeWidgetcommande.topLevelItem(i).text(0)
                Pu = self.ui.treeWidgetcommande.topLevelItem(i).text(3)
                Qte = self.ui.treeWidgetcommande.topLevelItem(i).text(2)
                Total = int(self.ui.treeWidgetcommande.topLevelItem(i).text(4))
                InfoData = (IdCommande, Libelle, Qte, Pu, Total)
                data = (Qte,IdProduit)
                if self.stock.AjoutInfoCommande(InfoData) == 0:
                    PrixTotal += Total
                    self.stock.UpdateQteStock(data)
            if self.stock.UpdateCommande(PrixTotal,IdCommande) == 0:
                self.ui.treeWidgetcommande.clear()
                #self.ui.treeWidgetcommande.topLevelItem(self.Ligne).setHidden(True)
                self.ui.lineEditQteproduit.setText(" ")
After placing an order, I would like to reset my tree widget and be able to place further orders.
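A likely cause, offered as a guess since the rest of the class isn't shown: `clear()` removes every item, but `self.Ligne` keeps its old value, so on the next order `topLevelItem(self.Ligne)` points past the newly added item and returns `None`. The sketch below uses a small stand-in class (`FakeTree`, hypothetical) instead of a real `QTreeWidget` so it runs without Qt; it only mimics the fact that `topLevelItem` returns `None` for an index that does not exist. `OrderForm`, `add_line` and `validate` are likewise made-up names.

```python
class FakeTree:
    """Stand-in for QTreeWidget: a list of rows, None for a missing index."""

    def __init__(self):
        self.items = []

    def add_item(self):
        self.items.append({})

    def topLevelItem(self, index):
        return self.items[index] if 0 <= index < len(self.items) else None

    def clear(self):
        self.items = []


class OrderForm:
    def __init__(self):
        self.tree = FakeTree()
        self.Ligne = 0  # index of the next row to fill

    def add_line(self, label):
        self.tree.add_item()
        item = self.tree.topLevelItem(self.Ligne)
        if item is None:
            raise AttributeError("row counter is out of sync with the tree")
        item["label"] = label
        self.Ligne += 1

    def validate(self, reset_counter):
        self.tree.clear()
        if reset_counter:
            self.Ligne = 0  # keep the counter in step with the emptied tree


form = OrderForm()
form.add_line("first order")
form.validate(reset_counter=True)
form.add_line("second order")  # works because the counter was reset
```

In the real code this would correspond to adding `self.Ligne = 0` right after `self.ui.treeWidgetcommande.clear()`.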

Struggling to validate a user entry in tkinter

Here is part of some code I created for a project in tkinter, using sqlite3 as the database in Python. I'm trying to make the entry fields accept only integer values, and tried to implement this in the validation function. I've tried using try and except, but this still seems to allow all values to be added to the table. How else could I make this work?
    def validation (self):
        try:
            int(self.inc.get()) and int(self.out.get()) == True
        except ValueError:
            self.message['text'] = 'Value must be a number!'
    def adding (self):
        if self.validation:
            query = 'INSERT INTO data VALUES (?,?)'
            parameters = (self.inc.get(), self.out.get())
            self.run_query (query, parameters)
            self.message ['text'] = 'Record {} added' .format (self.inc.get ())
            self.inc.delete (0, END)
            self.out.delete (0, END)
        else:
            self.message['text'] = 'Income or outgoing field is empty'
        self.viewing_records()
    def deleting (self):
        self.message ['text'] = ''
        try:
            self.tree.item(self.tree.selection ()) ['values'][0]
        except IndexError as e:
            self.message['text'] = 'Please, select record!'
            return
        self.message['text'] = ''
        Income = self.tree.item (self.tree.selection ()) ['text']
        query = 'DELETE FROM data WHERE totalinc = ?'
        self.run_query (query, (Income, ))
        self.message['text'] = 'Record {} deleted.'.format(Income)
        self.viewing_records()
    def editing (self):
        self.message['text'] = ''
        try:
            self.tree.item (self.tree.selection ())['values'][0]
        except IndexError as e:
            self.message['text'] = 'Please select record'
            return
        name = self.tree.item (self.tree.selection ())['text']
        old_out = self.tree.item (self.tree.selection ())['values'][0]
        self.edit_wind = Toplevel ()
        self.edit_wind.title ("Editing")
        Label (self.edit_wind, text = 'Old income:').grid (row = 0, column = 1)
        Entry (self.edit_wind, textvariable = StringVar(self.edit_wind, value = name), state = 'readonly').grid(row = 0, column = 2)
        Label (self.edit_wind, text = 'New income:').grid(row = 1, column = 1)
        new_inc = Entry (self.edit_wind)
        new_inc.grid (row = 1, column = 2)
        Label (self.edit_wind, text = 'Old outgoing:').grid (row = 2, column = 1)
        Entry (self.edit_wind, textvariable = StringVar(self.edit_wind, value = old_out), state = 'readonly').grid(row = 2, column = 2)
        Label (self.edit_wind, text = 'New outgoing: ').grid(row = 3, column = 1)
        new_out = Entry (self.edit_wind)
        new_out.grid (row = 3, column = 2)
        Button (self.edit_wind, text = 'Save changes', command = lambda: self.edit_records (new_inc.get(), name, new_out.get(), old_out)).grid (row = 4, column = 2, sticky = W)
        self.edit_wind.mainloop()
    def edit_records (self, new_inc, name, new_out, old_out):
        query = "UPDATE data SET totalinc = ?, totalout = ? WHERE totalinc = ? AND totalout = ?"
        parameters = (new_inc, new_out, name, old_out)
        self.run_query (query, parameters)
        self.edit_wind.destroy()
        self.message['text'] = 'Record {} changed.' .format (name)
        self.viewing_records()
if __name__ == '__main__':
    wind = Tk()
    application = Product (wind)
    wind.mainloop()
I suggest taking a look at str.isdigit():
value = '8'
if value.isdigit():
    print(value)
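Two things in the question's code defeat the check: `validation` never returns a value, and `if self.validation:` tests the method object (which is always truthy) instead of calling it. A minimal sketch of a check that returns a boolean, written as a plain function so it can be tried without a GUI; `is_valid_amounts` is a hypothetical name:

```python
def is_valid_amounts(income_text, outgoing_text):
    """Return True only if both strings parse as integers."""
    try:
        int(income_text)
        int(outgoing_text)
    except ValueError:
        return False
    return True


print(is_valid_amounts("120", "45"))
print(is_valid_amounts("12a", "45"))
print(is_valid_amounts("", "45"))
```

Inside the class, `adding` would call it with `self.inc.get()` and `self.out.get()` and insert only when it returns True. Unlike `isdigit()`, `int()` also accepts a leading minus sign and surrounding whitespace, so the choice depends on whether negative amounts should be allowed.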

update_cells not working as expected

I have written this function:
def duplicate_sheet1(wb, title=None):
    if title is None:
        title = wb.sheet1.title + ' DUPLICATE'
    wb._sheet_list = [wb.sheet1]
    wb.add_worksheet(title, wb.sheet1.row_count, wb.sheet1.col_count)
    wb._sheet_list = wb._sheet_list[::-1]
    wb._sheet_list[0].update_cells(wb._sheet_list[1]._fetch_cells())
Everything works as expected when inspected with a debugger, except update_cells: when I call _fetch_cells on worksheet 0 after running the code, the sheet is empty.
Apparently the list returned by _fetch_cells is not what update_cells expects. This may be because _fetch_cells does not include empty cells in the returned list, or because update_cells only works with a 1-D or 2-D grid; I am unsure.
Here is the workaround I found; apologies, as the code could probably be improved:
def duplicate_sheet1(wb, title=None):
    if title is None:
        title = wb.sheet1.title + ' DUPLICATE'
    wb._sheet_list = [wb.sheet1]
    wb.add_worksheet(title, wb.sheet1.row_count, wb.sheet1.col_count)
    wb._sheet_list = wb._sheet_list[::-1]
    cell_list = build_cell_list(wb._sheet_list[0], wb._sheet_list[1])
    wb._sheet_list[0].update_cells(cell_list)
def build_cell_list(new_worksheet, old_worksheet):
    fetched = old_worksheet._fetch_cells()
    max_row = fetched[-1].row
    max_col = max([cell.col for cell in fetched])
    # chr(max_col + 64) maps 1 -> 'A'; this only gives a valid letter up to column 26.
    cell_list = new_worksheet.range('A1:' + chr(max_col + 64) + str(max_row))
    for cell in cell_list:
        # Copy the value of the matching fetched cell, or '' if that cell was empty.
        cell.value = next(
            (
                f.value for f in fetched
                if f.col == cell.col and f.row == cell.row
            ),
            '',
        )
    return cell_list
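The workaround above scans the whole `fetched` list once per target cell, and `chr(max_col + 64)` only yields a valid column letter up to column 26 (`Z`). A sketch of both pieces in plain Python, using `(row, col, value)` tuples as stand-ins for cell objects; the helper names are hypothetical and nothing here calls the spreadsheet library:

```python
def column_letters(col):
    """Convert a 1-based column number to spreadsheet letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, remainder = divmod(col - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def dense_values(sparse_cells):
    """Expand sparse (row, col, value) tuples into a row-major grid, '' for gaps."""
    lookup = {(row, col): value for row, col, value in sparse_cells}
    max_row = max(row for row, _, _ in sparse_cells)
    max_col = max(col for _, col, _ in sparse_cells)
    return [
        [lookup.get((row, col), "") for col in range(1, max_col + 1)]
        for row in range(1, max_row + 1)
    ]


cells = [(1, 1, "name"), (1, 3, "qty"), (2, 2, "x")]
grid = dense_values(cells)
print(column_letters(27), grid)
```

The dictionary lookup replaces the nested scan, and taking `max` over the rows avoids relying on the last fetched cell being in the last row.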

[Yii][SQL] Not getting records when joining tables to get the count of offers

I have the following code so that I can sort by offers (bids and buys):
$clean_criteria = new CDbCriteria;
$clean_criteria->order = 'promoted DESC';
$criteria = new CDbCriteria;
$criteria->order = 'promoted DESC';
$criteria->select = "*, (COUNT(aa.id) + COUNT(ab.id)) as offers";
$criteria->join = ' LEFT JOIN auction_bid aa ON (t.id = aa.auction)';
$criteria->join .= ' LEFT JOIN auction_buy ab ON (t.id = ab.auction)';
I am using pagination and sort:
$item_count = Auction::model()->count($clean_criteria);
$pages = new CPagination($item_count);
$pages->setPageSize(5);
$pages->applyLimit($clean_criteria);
$sort = new CSort('Auction');
$sort->attributes = array('offers'=>array('asc'=>'offers', 'desc'=>'offers DESC'));
$sort->applyOrder($clean_criteria);
But to get offers I must use $criteria when finding auctions:
$auctions = Auction::model()->findAll($criteria);
The site shows the correct page count, but every page shows the same single auction. The problem is that the query returns only one record. How can I fix this?
P.S. Maybe I used the wrong type of join, and auctions that have no offers are not returned?
EDIT:
This code works properly:
$criteria->select = "*, COUNT(abid.id) AS offers1";
$criteria->join = ' LEFT JOIN auction_bid abid ON (t.id = abid.auction)';
$criteria->group = 't.id';
It returns offers1 = 1 for the first auction.
The following works too:
$criteria->select = "*, COUNT(abuy.id) AS offers2";
$criteria->join = ' LEFT JOIN auction_buy abuy ON (t.id = abuy.auction)';
$criteria->group = 't.id';
It returns offers2 = 4 for the first auction.
But when I combine them into one query:
$criteria->select = "*, COUNT(abid.id) AS offers1, COUNT(abuy.id) AS offers2";
$criteria->join = ' LEFT JOIN auction_bid abid ON (t.id = abid.auction)';
$criteria->join .= ' LEFT JOIN auction_buy abuy ON (t.id = abuy.auction)';
$criteria->group = 't.id';
It returns offers1 = 4 and offers2 = 4 for the first auction.
How can I fix it?
Solved:
criteria:
$criteria->select = "*, (COUNT(DISTINCT abid.id) + COUNT(DISTINCT abuy.quantity)) AS `offers`";
$criteria->join = ' LEFT JOIN auction_bid abid ON (t.id = abid.auction)';
$criteria->join .= ' LEFT JOIN auction_buy abuy ON (t.id = abuy.auction)';
$criteria->group = 't.id';
sorting:
$sort = new CSort('Auction');
$sort->attributes = array('offers');
$sort->applyOrder($criteria);
pagination:
$item_count = Auction::model()->count($clean_criteria);
$pages = new CPagination($item_count);
$pages->applyLimit($criteria);
searching:
$auctions = Auction::model()->findAll($criteria);
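The doubled counts come from the two LEFT JOINs multiplying rows: an auction with 1 bid and 4 buys produces 1 x 4 = 4 joined rows, so a plain COUNT sees 4 non-null ids on both sides. The sketch below reproduces that with Python's built-in sqlite3 module and a made-up minimal schema (only the columns used in the joins). It counts distinct `abuy.id`, which is an assumption on my part; the posted solution counts distinct `abuy.quantity`, which would undercount if two buys had the same quantity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table auction (id integer primary key);
    create table auction_bid (id integer primary key, auction integer);
    create table auction_buy (id integer primary key, auction integer);
    insert into auction (id) values (1);
    insert into auction_bid (auction) values (1);
    insert into auction_buy (auction) values (1), (1), (1), (1);
""")

# Same join shape as the Yii criteria: two one-to-many joins on one auction.
join = """
    from auction t
    left join auction_bid abid on t.id = abid.auction
    left join auction_buy abuy on t.id = abuy.auction
    group by t.id
"""

plain = conn.execute("select count(abid.id), count(abuy.id)" + join).fetchone()
distinct = conn.execute(
    "select count(distinct abid.id), count(distinct abuy.id)" + join
).fetchone()

print(plain)     # (4, 4)
print(distinct)  # (1, 4)
```

The same row multiplication explains the original symptom: without `group by t.id`, the aggregate collapses the whole result into a single row, so only one auction comes back.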

Lua: how to use all tables in a table

positions = {
    -- table 1
    [1] = {pos = {fromPosition = {x=1809, y=317, z=8},toPosition = {x=1818, y=331, z=8}}, m = {"100 monster"}},
    -- table 2
    [2] = {pos = {fromPosition = {x=1809, y=317, z=8},toPosition = {x=1818, y=331, z=8}}, m = {"100 monster"}},
    -- table 3
    [3] = {pos = {fromPosition = {x=1809, y=317, z=8},toPosition = {x=1818, y=331, z=8}}, m = {"100 monster"}}
}
tb = positions[?] -- what needs to go here?
for _,x in pairs(tb.m) do --function
    for s = 1, tonumber(x:match("%d+")) do
        pos = {x = math.random(tb.pos.fromPosition.x, tb.pos.toPosition.x), y = math.random(tb.pos.fromPosition.y, tb.pos.toPosition.y), z = tb.pos.fromPosition.z}
        doCreateMonster(x:match("%s(.+)"), pos)
    end
end
Here is the problem: I use tb = positions[1], and it only works for one table in the "positions" table. How do I apply this function to all tables in this table?
I don't know Lua very well but you could loop over the table:
for i = 0, table.getn(positions), 1 do
    tb = positions[i]
    ...
end
Sources :
http://lua.gts-stolberg.de/en/schleifen.php and http://www.lua.org/pil/19.1.html
You need to iterate over positions with a numerical for.
Note that, unlike Antoine Lassauzay's answer, the loop starts at 1 and not 0, and uses the # operator instead of table.getn (deprecated function in Lua 5.1, removed in Lua 5.2).
for i=1,#positions do
    tb = positions[i]
    ...
end
Use the pairs() built-in. There isn't any reason to do a numeric for loop here.
for index, position in pairs(positions) do
    tb = positions[index]
    -- tb is now exactly the same value as variable 'position'
end

Resources