Python: using Pool to create multiple processes, but the results are never produced - multiprocessing

I have placed all of the functions in one class, including the function that creates the processes and the function they execute; this class is then called from another file.
from multiprocessing import Pool

def initData(self, type):
    # create six process to deal with the data
    if type == 'train':
        data = pd.read_csv('./data/train_merged_8.csv')
    elif type == 'test':
        data = pd.read_csv('./data/test_merged_2.csv')
    modelvec = allWord2Vec('no').getModel()
    modelvec_all = allWord2Vec('all').getModel()
    modelvec_stop = allWord2Vec('stop').getModel()
    p = Pool(6)
    count = 0
    for i in data.index:
        count += 1
        p.apply_async(self.valueCal, args=(i, data, modelvec, modelvec_all, modelvec_stop))
        if count % 1000 == 0:
            print(str(count // 100) + 'h rows of data has been dealed')
    p.close()
    p.join

def valueCal(self, i, data, modelvec, modelvec_all, modelvec_stop):
    # the function run in process
    list_con = []
    q1 = str(data.get_value(i, 'question1')).split()
    q2 = str(data.get_value(i, 'question2')).split()
    f1 = self.getF1_union(q1, q2)
    f2 = self.getF2_inter(q1, q2)
    f3 = self.getF3_sum(q1, q2)
    f4_q1 = len(q1)
    f4_q2 = len(q2)
    f4_rate = f4_q1 / f4_q2
    q1 = [','.join(str(ve)) for ve in q1]
    q2 = [','.join(str(ve)) for ve in q2]
    list_con.append('|'.join(q1))
    list_con.append('|'.join(q2))
    list_con.append(f1)
    list_con.append(f2)
    list_con.append(f3)
    list_con.append(f4_q1)
    list_con.append(f4_q2)
    list_con.append(f4_rate)
    f = open('./data/test.txt', 'a')
    f.write('\t'.join(list_con) + '\n')
    f.close()
The output shown below appears very quickly, but I never see the file being created. When I check the task manager, six processes have indeed been created and are consuming a lot of my CPU. Yet even when the program has finished, the file still has not been created.
How can I solve this problem?
10h rows of data have been dealed
20h rows of data have been dealed
30h rows of data have been dealed
40h rows of data have been dealed
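A likely reason nothing is written is that apply_async stores any exception raised inside valueCal in its AsyncResult and only re-raises it when .get() is called, and that p.join without parentheses never actually waits for the workers. A minimal sketch (not the poster's code) that keeps the AsyncResult handles and calls .get() makes such failures visible:

from multiprocessing import Pool

def work(i):
    # stand-in for the real per-row function (valueCal in the question)
    return i * i

if __name__ == '__main__':
    p = Pool(6)
    results = [p.apply_async(work, args=(i,)) for i in range(10)]
    p.close()
    p.join()  # note the parentheses: p.join alone does nothing
    # .get() re-raises any exception that occurred inside the worker,
    # so a silently failing task becomes visible here
    print([r.get() for r in results])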

Related

Improve code result speed by multiprocessing

I'm self-studying Python and this is my first piece of code.
I'm working on analyzing logs from our servers; usually I need to analyze a full day of logs. I created a script (this is an example with simplified logic) just to check the speed. With plain sequential code, analyzing 20 million rows takes about 12-13 minutes, but I need to handle 200 million rows in 5 minutes.
What I tried:
Multiprocessing (I hit an issue with shared memory, which I think I fixed). But the result was 300K rows = 20 seconds, no matter how many processes I used. (PS: I also need to control the number of processes in advance.)
Threading (I found that it gives no speed-up: 300K rows = 2 seconds, but the plain code is the same, 300K = 2 seconds).
Asyncio (I suspect the script is slow because it has to read many files). The result is the same as threading: 300K = 2 seconds.
In the end I think all three of my scripts are incorrect and don't work as intended.
PS: I try to avoid using specialized Python modules (like pandas) because that would make it harder to run on different servers; it's better to stick to common libraries.
Please help me check the first one - multiprocessing.
import csv
import os
from multiprocessing import Process, Queue, Value, Manager

file = {"hcs.log", "hcs1.log", "hcs2.log", "hcs3.log"}

def argument(m, a, n):
    proc_num = os.getpid()
    a_temp_m = a["vod_miss"]
    a_temp_h = a["vod_hit"]
    with open(os.getcwd() + '/' + m, newline='') as hcs_1:
        hcs_2 = csv.reader(hcs_1, delimiter=' ')
        for j in hcs_2:
            if j[3].find('MISS') != -1:
                a_temp_m[n] = a_temp_m[n] + 1
            elif j[3].find('HIT') != -1:
                a_temp_h[n] = a_temp_h[n] + 1
    a["vod_miss"][n] = a_temp_m[n]
    a["vod_hit"][n] = a_temp_h[n]

if __name__ == '__main__':
    procs = []
    manager = Manager()
    vod_live_cuts = manager.dict()
    i = "vod_hit"
    ii = "vod_miss"
    cpu = 1
    n = 1
    vod_live_cuts[i] = manager.list([0] * cpu)
    vod_live_cuts[ii] = manager.list([0] * cpu)
    for m in file:
        proc = Process(target=argument, args=(m, vod_live_cuts, (n - 1)))
        procs.append(proc)
        proc.start()
        if n >= cpu:
            n = 1
            proc.join()
        else:
            n += 1
    [proc.join() for proc in procs]
    [proc.close() for proc in procs]
I expect each file to be processed by an independent process via def argument, and finally all results to be saved in the dict vod_live_cuts. For each process I added an independent list to the dict; I thought that would help with cross-process use of this parameter, but maybe that's the wrong way :(
Using IPC is costly, so only use "shared objects" for saving the final result, not for intermediate results while parsing the file.
Limiting the number of processes is done by using a multiprocessing.Pool; the following code uses it to reach the maximum hard-disk speed, and you only need to post-process the results.
You can only parse data as fast as your HDD can read it (typically 30-80 MB/s), so if you need to improve performance further you should use an SSD or RAID 0 for higher disk speed; you cannot get much faster than this without changing your hardware.
import csv
import os
from multiprocessing import Process, Queue, Value, Manager, Pool

file = {"hcs.log", "hcs1.log", "hcs2.log", "hcs3.log"}

def argument(m, a):
    proc_num = os.getpid()
    a_temp_m_n = 0  # make it local to process
    a_temp_h_n = 0  # as shared lists use IPC
    with open(os.getcwd() + '/' + m, newline='') as hcs_1:
        hcs_2 = csv.reader(hcs_1, delimiter=' ')
        for j in hcs_2:
            if j[3].find('MISS') != -1:
                a_temp_m_n = a_temp_m_n + 1
            elif j[3].find('HIT') != -1:
                a_temp_h_n = a_temp_h_n + 1
    a["vod_miss"].append(a_temp_m_n)
    a["vod_hit"].append(a_temp_h_n)

if __name__ == '__main__':
    manager = Manager()
    vod_live_cuts = manager.dict()
    i = "vod_hit"
    ii = "vod_miss"
    cpu = 1
    vod_live_cuts[i] = manager.list()
    vod_live_cuts[ii] = manager.list()
    with Pool(cpu) as pool:
        tasks = []
        for m in file:
            task = pool.apply_async(argument, args=(m, vod_live_cuts))
            tasks.append(task)
        for task in tasks:
            task.get()
    print(list(vod_live_cuts[i]))
    print(list(vod_live_cuts[ii]))
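Post-processing the collected per-file results is then straightforward; as a minimal sketch (assuming overall totals are all that's needed), the manager lists can simply be summed after the pool has finished, inside the same __main__ block:

    # hypothetical post-processing: aggregate the per-file counts
    total_hit = sum(list(vod_live_cuts["vod_hit"]))
    total_miss = sum(list(vod_live_cuts["vod_miss"]))
    print("total HIT:", total_hit, "total MISS:", total_miss)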

applyInPandas() aggregation runs slowly on big delta table

I'm trying to create a gold table notebook in Databricks; however, it would take 9 days to fully reprocess the historical data (43 GB, 35k parquet files). I tried scaling up the cluster, but it doesn't go above 5000 records/second. The bottleneck seems to be the applyInPandas() function. I'm wondering if I could replace pandas with anything else to make the gold notebook execute faster.
The silver table has 60 columns (read_id, reader_id, tracker_timestamp, event_type, ebook_id, page_id, agent_ip, agent_device_type, ...). Each row of data represents a read event of an ebook, e.g. 'page turn', 'click on image', 'click on link', ... All of the events that occurred in a single session have the same read_id. In the gold table I'm trying to group those events into sessions and calculate the number of times each event occurred in a single session. So instead of 100+ rows of data per read session in the silver table, I would end up with just a single aggregated row in the gold table.
Input is the silver delta table:
import pyspark.sql.functions as F
import pyspark.sql.types as T
import pandas as pd
from pyspark.sql.functions import pandas_udf
input = (spark
    .readStream
    .format("delta")
    .option("withEventTimeOrder", "true")
    .option("maxFilesPerTrigger", 100)
    .load(f"path_to_silver_bucket")
)
I use the withWatermark and session_window functions to ensure I end up grouping all of the events from a single read session (a read session automatically ends 30 minutes after the last reader activity).
group = input.withWatermark("tracker_timestamp", "10 minutes").groupBy("read_id", F.session_window(input.tracker_timestamp, "30 minutes"))
In the next step I use the applyInPandas function like so:
sessions = group.applyInPandas(processing_function, schema=processing_function_output_schema)
Definition of the processing_function used in applyInPandas:
def processing_function(df):
    surf_time_ms = df.query('event_type == "surf"')['duration'].sum()
    immerse_time_ms = df.query('event_type == "immersion"')['duration'].sum()
    min_timestamp = df['tracker_timestamp'].min()
    max_timestamp = df['tracker_timestamp'].max()
    shares = len(df.query('event_type == "share"'))
    leads = len(df.query('event_type == "lead_store"'))
    is_read = len(df.query('event_type == "surf"')) > 0
    distinct_pages = df['page_id'].nunique()
    data = {
        "read_id": df['read_id'].values[0],
        "surf_time_ms": surf_time_ms,
        "immerse_time_ms": immerse_time_ms,
        "min_timestamp": min_timestamp,
        "max_timestamp": max_timestamp,
        "shares": shares,
        "leads": leads,
        "is_read": is_read,
        "number_of_events": len(df),
        "distinct_pages": distinct_pages
    }
    for field in not_calculated_string_fields:
        data[field] = df[field].values[0]
    new_df = pd.DataFrame(data=data, index=['read_id'])
    for x in all_events:
        new_df[f"count_{x}"] = df.query(f"type == '{x}'").count()
    for x in duration_events:
        duration = df.query(f"event_type == '{x}'")['duration']
        duration_sum = duration.sum()
        new_df[f"duration_{x}_ms"] = duration_sum
        if duration_sum > 0:
            new_df[f"mean_duration_{x}_ms"] = duration.mean()
        else:
            new_df[f"mean_duration_{x}_ms"] = 0
    return new_df
And finally, I'm writing the calculated row to the gold table like so:
for_partitioning = (sessions
    .withColumn("tenant", F.col("story_tenant"))
    .withColumn("year", F.year(F.col("min_timestamp")))
    .withColumn("month", F.month(F.col("min_timestamp"))))

checkpoint_path = "checkpoint-path"
gold_path = f"gold-bucket"

(for_partitioning
    .writeStream
    .format('delta')
    .partitionBy('year', 'month', 'tenant')
    .option("mergeSchema", "true")
    .option("checkpointLocation", checkpoint_path)
    .outputMode("append")
    .start(gold_path))
Can anybody think of a more efficient way to do a UDF in PySpark than applyInPandas for the above example? I simply cannot afford to wait 9 days to reprocess 43GB of data...
I've tried playing around with different input and output options (e.g. .option("maxFilesPerTrigger", 100)) but the real problem seems to be applyInPandas.
You could rewrite your processing_function into native Spark if you really wanted.
"read_id": df['read_id'].values[0]
F.first('read_id').alias('read_id')
"surf_time_ms": df.query('event_type == "surf"')['duration'].sum()
F.sum(F.when(F.col('event_type') == 'surf', F.col('duration'))).alias('surf_time_ms')
"immerse_time_ms": df.query('event_type == "immersion"')['duration'].sum()
F.sum(F.when(F.col('event_type') == 'immersion', F.col('duration'))).alias('immerse_time_ms')
"min_timestamp": df['tracker_timestamp'].min()
F.min('tracker_timestamp').alias('min_timestamp')
"max_timestamp": df['tracker_timestamp'].max()
F.max('tracker_timestamp').alias('max_timestamp')
"shares": len(df.query('event_type == "share"'))
F.count(F.when(F.col('event_type') == 'share', F.lit(1))).alias('shares')
"leads": len(df.query('event_type == "lead_store"'))
F.count(F.when(F.col('event_type') == 'lead_store', F.lit(1))).alias('leads')
"is_read": len(df.query('event_type == "surf"')) > 0
(F.count(F.when(F.col('event_type') == 'surf', F.lit(1))) > 0).alias('is_read')
"number_of_events": len(df)
F.count(F.lit(1)).alias('number_of_events')
"distinct_pages": df['page_id'].nunique()
F.countDistinct('page_id').alias('distinct_pages')
for field in not_calculated_string_fields:
data[field] = df[field].values[0]
*[F.first(field).alias(field) for field in not_calculated_string_fields]
for x in all_events:
new_df[f"count_{x}"] = df.query(f"type == '{x}'").count()
The above can probably be skipped? As far as my tests go, new columns get NaN values, because .count() returns a Series object instead of one simple value.
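If those per-event counts are actually wanted, a sketch of the native equivalent (assuming event_type, rather than type, is the intended column) would mirror the shares/leads pattern above:

*[F.count(F.when(F.col('event_type') == x, F.lit(1))).alias(f"count_{x}") for x in all_events]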
for x in duration_events:
duration = df.query(f"event_type == '{x}'")['duration']
duration_sum = duration.sum()
new_df[f"duration_{x}_ms"] = duration_sum
if duration_sum > 0:
new_df[f"mean_duration_{x}_ms"] = duration.mean()
else:
new_df[f"mean_duration_{x}_ms"] = 0
*[F.sum(F.when(F.col('event_type') == x, F.col('duration'))).alias(f"duration_{x}_ms") for x in duration_events]
*[F.mean(F.when(F.col('event_type') == x, F.col('duration'))).alias(f"mean_duration_{x}_ms") for x in duration_events]
So, instead of
def processing_function(df):
...
...
sessions = group.applyInPandas(processing_function, schema=processing_function_output_schema)
you could use efficient native Spark:
sessions = group.agg(
    F.first('read_id').alias('read_id'),
    F.sum(F.when(F.col('event_type') == 'surf', F.col('duration'))).alias('surf_time_ms'),
    F.sum(F.when(F.col('event_type') == 'immersion', F.col('duration'))).alias('immerse_time_ms'),
    F.min('tracker_timestamp').alias('min_timestamp'),
    F.max('tracker_timestamp').alias('max_timestamp'),
    F.count(F.when(F.col('event_type') == 'share', F.lit(1))).alias('shares'),
    F.count(F.when(F.col('event_type') == 'lead_store', F.lit(1))).alias('leads'),
    (F.count(F.when(F.col('event_type') == 'surf', F.lit(1))) > 0).alias('is_read'),
    F.count(F.lit(1)).alias('number_of_events'),
    F.countDistinct('page_id').alias('distinct_pages'),
    *[F.first(field).alias(field) for field in not_calculated_string_fields],
    # skipped count_{x}
    *[F.sum(F.when(F.col('event_type') == x, F.col('duration'))).alias(f"duration_{x}_ms") for x in duration_events],
    *[F.mean(F.when(F.col('event_type') == x, F.col('duration'))).alias(f"mean_duration_{x}_ms") for x in duration_events],
)

R estimating one independent variable more than once

I am trying to estimate a multinomial logit model for predicting systemic banking crises with panel data. Below is my code. I have run this code before and it worked fine. However, I changed the names of the independent variables and used the new data to run the model again, and ever since then R has been estimating multiple iterations of the x1 variable. When I drop x1, the model estimation turns out fine again. I have attached screenshots of the results: Faulty_result1, Faulty_result_2 and Result_with_x1_dropped. I can't seem to figure out what the issue is. Any help will be much appreciated.
#Remove all items from memory (if any)
rm(list=ls(all=TRUE))
#Set working directory to load files
setwd("D:/PhD/Codes")
#Load necessary libraries
library(readr)
library(nnet)
library(plm)
#Load data
my_data <- read_csv("D:/PhD/Data/xx_Final Data_4.csv",
                    col_types = cols(`Time Period` = col_date(format = "%d/%m/%Y"),
                                     y = col_factor(levels = c("0", "1", "2")),
                                     x2 = col_double(), x5 = col_double(),
                                     x9 = col_double(), x11 = col_double(),
                                     x13 = col_double(), x24 = col_double()),
                    na = "NA")
#Change levels from numeric to character
levels(my_data$y) <- c("Tranquil", "Pre-crisis", "Crisis")
str(my_data$y)
#Create Panel Data
p_data=pdata.frame(my_data)
#Export dataset
write_csv(p_data,"D:/PhD/Data/Clean_Final Data_4.csv")
#Drop unnecessary columns
p <- subset(p_data, select = c(3:27))
#Set reference level
p$y <- relevel(p$y, ref="Tranquil")
#Create Model
model <- multinom(y~ ., data = p)
summary(model)
stargazer::stargazer(model, type = "text")

Extract multiple protein sequences from a Protein Data Bank along with Secondary Structure

I want to extract protein sequences and their corresponding secondary structure from any Protein Data Bank, say RCSB. I just need short sequences and their secondary structure, something like:
ATRWGUVT Helix
It is fine even if the sequences are long, but I want a tag at the end that denotes the secondary structure. Is there any programming tool or anything available for this?
As I've shown above, I want only this minimal amount of information. How can I achieve this?
import sys

from Bio.PDB import *
from distutils import spawn
Extract sequence:
def get_seq(pdbfile):
    p = PDBParser(PERMISSIVE=0)
    structure = p.get_structure('test', pdbfile)
    ppb = PPBuilder()
    seq = ''
    for pp in ppb.build_peptides(structure):
        seq += pp.get_sequence()
    return seq
Extract secondary structure with DSSP as explained earlier:
def get_secondary_struc(pdbfile):
    # get secondary structure info for the whole pdb
    if not spawn.find_executable("dssp"):
        sys.stderr.write('dssp executable needs to be in folder')
        sys.exit(1)
    p = PDBParser(PERMISSIVE=0)
    ppb = PPBuilder()
    structure = p.get_structure('test', pdbfile)
    model = structure[0]
    dssp = DSSP(model, pdbfile)
    count = 0
    sec = ''
    for residue in model.get_residues():
        count = count + 1
        # print(residue, count)
        a_key = list(dssp.keys())[count - 1]
        sec += dssp[a_key][2]
    print(sec)
    return sec
This should print both sequence and secondary structure.
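For example, a minimal driver (the file name is just a placeholder) could combine the two functions to get output close to what the question asks for:

if __name__ == '__main__':
    pdbfile = 'example.pdb'  # hypothetical local PDB file
    seq = get_seq(pdbfile)
    sec = get_secondary_struc(pdbfile)
    # print each residue next to its DSSP code, e.g. "A H" for a helix residue
    for aa, ss in zip(seq, sec):
        print(aa, ss)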
You can use DSSP.
The output of DSSP is explained extensively under 'explanation'. The very short summary of the output is:
H = α-helix
B = residue in isolated β-bridge
E = extended strand, participates in β ladder
G = 3-helix (3₁₀ helix)
I = 5 helix (π-helix)
T = hydrogen bonded turn
S = bend
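If a single coarse tag such as "Helix" is enough, a common convention (an assumption here, not part of DSSP itself; several variants exist) reduces the eight DSSP codes to three classes:

def simplify_ss(code):
    # common 8-to-3 state reduction of DSSP codes (assumed convention)
    if code in ('H', 'G', 'I'):
        return 'Helix'
    if code in ('E', 'B'):
        return 'Strand'
    return 'Coil'  # T, S and '-' (loop/irregular)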

Join tiles in Corona SDK into one word for a Breakout game grid?

I have a game project to re-implement Breakout. I want to display two words, one per line, built up out of the brick blocks: the top line is the first name, aligned left, and the bottom line is the last name, aligned right. They are entered in textboxes and rendered as shown:
Every second, the screen adds a configurable number of bricks to the grid (for example, five bricks per second) until the two words appear complete. I have managed to display a single letter of the alphabet, created from a matrix of 0s and 1s.
...But I don't know how to join them into one word. How can I join these letters?
This is what I've gotten so far:
Bricks.lua
local Bricks = display.newGroup() -- static object
local Events = require("Events")
local Levels = require("Levels")
local sound = require("Sound")
local physics = require("physics")
local Sprites = require("Sprites")
local Func = require("Func")
local brickSpriteData =
{
{
name = "brick",
frames = {Sprites.brick}
},
{
name = "brick2",
frames = {Sprites.brick2}
},
{
name = "brick3",
frames = {Sprites.brick3}
},
}
-- animation table
local brickAnimations = {}
Sprites:CreateAnimationTable
{
spriteData = brickSpriteData,
animationTable = brickAnimations
}
-- get size from temp object for later use
local tempBrick = display.newImage('red_apple_20.png',300,500)
--local tempBrick = display.newImage('cheryGreen2.png',300,500)
local brickSize =
{
width = tempBrick.width,
height = tempBrick.height
}
--tempBrick:removeSelf( )
----------------
-- Rubble -- needs to be moved to its own file
----------------
local rubbleSpriteData =
{
{
name = "rubble1",
frames = {Sprites.rubble1}
},
{
name = "rubble2",
frames = {Sprites.rubble2}
},
{
name = "rubble3",
frames = {Sprites.rubble3}
},
{
name = "rubble4",
frames = {Sprites.rubble4}
},
{
name = "rubble5",
frames = {Sprites.rubble5}
},
}
local rubbleAnimations = {}
Sprites:CreateAnimationTable
{
spriteData = rubbleSpriteData,
animationTable = rubbleAnimations
}
local totalBricksBroken = 0 -- used to track when level is complete
local totalBricksAtStart = 0
-- contains all brick objects
local bricks = {}
local function CreateBrick(data)
-- random brick sprite
local obj = display.newImage('red_apple_20.png')
local objGreen = display.newImage('cheryGreen2.png')
obj.name = "brick"
obj.x = data.x --or display.contentCenterX
obj.y = data.y --or 1000
obj.brickType = data.brickType or 1
obj.index = data.index
function obj:Break()
totalBricksBroken = totalBricksBroken + 1
bricks[self.index] = nil
obj:removeSelf( )
sound.play(sound.breakBrick)
end
function obj:Update()
if(self == nil) then
return
end
if(self.y > display.contentHeight - 20) then
obj:Break()
end
end
if(obj.brickType ==1) then
physics.addBody( obj, "static", {friction=0.5, bounce=0.5 } )
elseif(obj.brickType == 2) then
physics.addBody( objGreen,"static",{friction=0.2, bounce=0.5, density = 1 } )
end
return obj
end
local currentLevel = testLevel
-- create level from bricks defined in an object
-- this allows for levels to be designed
local function CreateBricksFromTable(level)
totalBricksAtStart = 0
local activeBricksCount = 0
for yi=1, #level.bricks do
for xi=1, #level.bricks[yi] do
-- create brick?
if(level.bricks[yi][xi] > 0) then
local xPos
local yPos
if(level.align == "center") then
--1100-((99*16)*0.5)
xPos = display.contentCenterX- ((level.columns * brickSize.width) * 0.5/3) + ((xi-1) * level.xSpace)--display.contentCenterX
--xPos = 300 +(xi * level.xSpace)
yPos = 100 + (yi * level.ySpace)--100
else
xPos = level.xStart + (xi * level.xSpace)
yPos = level.yStart + (yi * level.ySpace)
end
local brickData =
{
x = xPos,
y = yPos,
brickType = level.bricks[yi][xi],
index = activeBricksCount+1
}
bricks[activeBricksCount+1] = CreateBrick(brickData)
activeBricksCount = activeBricksCount + 1
end
end
end
totalBricks = activeBricksCount
totalBricksAtStart = activeBricksCount
end
-- create bricks for level --> set from above functions, change function to change brick build type
local CreateAllBricks = CreateBricksFromTable
-- called by a timer so I can pass arguments to CreateAllBricks
local function CreateAllBricksTimerCall()
CreateAllBricks(Levels.currentLevel)
end
-- remove all brick objects from memory
local function ClearBricks()
for i=1, #bricks do
bricks[i] = nil
end
end
-- stuff run on enterFrame event
function Bricks:Update()
-- update individual bricks
if(totalBricksAtStart > 0) then
for i=1, totalBricksAtStart do
-- brick exists?
if(bricks[i]) then
bricks[i]:Update()
end
end
end
-- is level over?
if(totalBricksBroken == totalBricks) then
Events.allBricksBroken:Dispatch()
end
end
----------------
-- Events
----------------
function Bricks:allBricksBroken(event)
-- cleanup bricks
ClearBricks()
local t = timer.performWithDelay( 1000, CreateAllBricksTimerCall)
--CreateAllBricks()
totalBricksBroken = 0
-- play happy sound for player to enjoy
sound.play(sound.win)
print("You Win!")
end
Events.allBricksBroken:AddObject(Bricks)
CreateAllBricks(Levels.currentLevel)
return Bricks
Levels.lua
local Events = require("Events")
local Levels = {}
local function MakeLevel(data)
local level = {}
level.xStart = data.xStart or 100
level.yStart = data.yStart or 100
level.xSpace = data.xSpace or 23
level.ySpace = data.ySpace or 23
level.align = data.align or "center"
level.columns = data.columns or #data.bricks[1]
level.bricks = data.bricks --> required
return level
end
Levels.test4 = MakeLevel
{
bricks =
{
{0,2,0,0,2,0,0,2,0},
{0,0,2,0,2,0,2,0,0},
{0,0,0,0,2,0,0,0,0},
{1,1,2,1,1,1,2,1,1},
{0,0,0,0,1,0,0,0,0},
{0,0,0,0,1,0,0,0,0},
{0,0,0,0,1,0,0,0,0},
}
}
Levels.test5 = MakeLevel
{
bricks =
{
{0,0,0,1,0,0,0,0},
{0,0,1,0,1,0,0,0},
{0,0,1,0,1,0,0,0},
{0,1,0,0,0,1,0,0},
{0,1,1,1,1,1,0,0},
{1,0,0,0,0,0,1,0},
{1,0,0,0,0,0,1,0},
{1,0,0,0,0,0,1,0},
{1,0,0,0,0,0,1,0}
}
}
-- Levels.test6 = MakeLevel2
-- {
-- bricks =
-- {
----A "a" = {{0,0,0,1,0,0,0,0},
-- {0,0,1,0,1,0,0,0},
-- {0,0,1,0,1,0,0,0},
-- {0,1,0,0,0,1,0,0},
-- {0,1,1,1,1,1,0,0},
-- {1,0,0,0,0,0,1,0},
-- {1,0,0,0,0,0,1,0},
-- {1,0,0,0,0,0,1,0},
-- {1,0,0,0,0,0,1,0}},
----B
-- "b" = {{1,1,1,1,0,0,0},
-- {1,0,0,0,1,0,0},
-- {1,0,0,0,1,0,0},
-- {1,0,0,0,1,0,0},
-- {1,1,1,1,0,0,0},
-- {1,0,0,0,1,0,0},
-- {1,0,0,0,0,1,0},
-- {1,0,0,0,0,1,0},
-- {1,1,1,1,1,0,0}},
--...........
--.......
--...
-- --Z
-- "z"= {{1,1,1,1,1,1,1,0},
-- {0,0,0,0,0,1,0,0},
-- {0,0,0,0,1,0,0,0},
-- {0,0,0,0,1,0,0,0},
-- {0,0,0,1,0,0,0,0},
-- {0,0,1,0,0,0,0,0},
-- {0,0,1,0,0,0,0,0},
-- {0,1,0,0,0,0,0,0},
-- {1,1,1,1,1,1,1,0}}
-- }
-- }
-- stores all levels in ordered table so that one can be selected randomly by index
Levels.levels =
{
--Levels.test4,
Levels.test5
-- Levels.test6,
}
function Levels:GetRandomLevel()
return self.levels[math.random(#Levels.levels)]
end
Levels.notPlayedYet = {}
Levels.currentLevel = Levels:GetRandomLevel()
-- Events
function Levels:allBricksBroken(event)
self.currentLevel = Levels:GetRandomLevel()
end
Events.allBricksBroken:AddObject(Levels)
return Levels
The work I've done thus far (same as above) as an external download: http://www.mediafire.com/download/1t89ftkbznkn184/Breakout2.rar
In the interest of actually answering the question:
I'm not 100% sure what you mean by "How can I join these letters", but from poking through the code I have a guess, so please clarify on whether it is accurate, or if I am wrong about what you wanted.
Scenario 1
You haven't successfully achieved the image illustrated in the screenshot - you've been able to draw one letter, but not multiple ones.
In this case, you'll need to have a better understanding of what your code is doing. The CreateBricksFromTable function takes in a Level object, which is created by the MakeLevel function from a table with a bricks property, which is a table of tables that represent rows with columns in them, showing what type of brick should be at each position. In your commented-out level, you have created a table where the bricks field contains a field for each letter, but the MakeLevel function still expects a bricks field that directly contains the grid of blocks. You will have to - as it seems you attempted - create a MakeWordLevel function (or the like) that takes this letter list, and a word for each line, and constructs a larger grid by copying the appropriate letters into it.
StackOverflow is not your programming tutor, and an SO question is not the right forum for having people write code for you or getting into step-by-step details of how to do this, but I'll leave you a basic outline. Your function would look something like this:
local function MakeWordLevel(data, line1, line2)
local level = {}
...
return level
end
And then it would have to:
Populate all of the same properties that MakeLevel does
Calculate how wide (level.columns) the level should be with all the letters
Create a table in the same format as the bricks properties, but big enough to hold all of the letters
Go through the input strings (line1 and line2), find the correct letter data from what is now the test6 array, and copy that data into the large table
Assign that table as level.bricks
This question already is a bit outside of what StackOverflow is intended for in that it asks about how to implement a feature rather than achieve a small, specific programming task, so any further followup should take place in a chatroom - perhaps the Hello World room would be helpful.
Scenario 2:
This was my original guess, but after considering and reading past edits, I doubt this is answering the right question
You may want a solid "background" of, say, red blocks, surrounding your letters and making the field into a solid "wall", with the name in a different color. And you may want these bricks to slowly show up a few at a time.
In that case, the main thing you need to do is keep track of what spaces are "taken" by the name bricks. There are many ways to do this, but I would start with a matrix to keep track of that - as big as the final playing field - full of 0's. Then, as you add the bricks for the name, set a 1 at the x,y location in that matrix according to that block's coordinate.
When you want to fill in the background, each time you go to add a block at a coordinate, check that "taken" matrix before trying to add a block - if it's taken (1), then just skip it and move onto the next coordinate.
This works if you're filling in the background blocks sequentially (say, left to right, top to bottom), or if you want to add them randomly. With random, you'd also want to keep updating the "taken" matrix so you don't try to add a block twice.
The random fill-in, however, presents its own problem - it will keep taking longer to fill in as it goes, because it'll find more and more "taken" blocks and have to pick a new one. There are solutions to this, of course, but I won't go too far down that road when I don't know if that's even what you want.
I don't really understand (or didn't really read, for that matter) your code, but from what I can see, joining the letters into complete words is easy. You have two possibilities.
You can "render" them directly into your level/display data, simply copy them to the appropriate places, like this:
-- The level data.
local level = {}
-- Create the level data.
for row = 1, 25, 1 do
local rowData = {}
for column = 1, 80, 1 do
rowData[column] = "."
end
level[row] = rowData
end
-- Now let us setup the letters.
local letters = {
A = {
{".",".",".","#",".",".",".","."},
{".",".","#",".","#",".",".","."},
{".",".","#",".","#",".",".","."},
{".","#",".",".",".","#",".","."},
{".","#","#","#","#","#",".","."},
{"#",".",".",".",".",".","#","."},
{"#",".",".",".",".",".","#","."},
{"#",".",".",".",".",".","#","."},
{"#",".",".",".",".",".","#","."}
},
B = {
{"#","#","#","#",".",".","."},
{"#",".",".",".","#",".","."},
{"#",".",".",".","#",".","."},
{"#",".",".",".","#",".","."},
{"#","#","#","#",".",".","."},
{"#",".",".",".","#",".","."},
{"#",".",".",".",".","#","."},
{"#",".",".",".",".","#","."},
{"#","#","#","#","#",".","."}
}
}
-- The string to print.
local text = "ABBA"
-- Let us insert the data into the level data.
for index = 1, #text, 1 do
local char = string.sub(text, index, index)
local charData = letters[char]
local offset = index * 7
for row = 1, 9, 1 do
local rowData = charData[row]
for column = 1, 7, 1 do
level[row][offset + column] = rowData[column]
end
end
end
-- Print everything
for row = 1, 25, 1 do
local rowData = level[row]
for column = 1, 80, 1 do
io.write(rowData[column])
end
print()
end
You save your letters in a lookup table and then copy them, piece by piece, into the level data. Here I replaced the numbers with dots and number signs to make it prettier on the command line.
Alternatively, you can also "render" the words into a prepared buffer and then insert that into the level data using the same logic.
