How is this preg_match working? - preg-match

I'm looking at some code that has
preg_match('/\[youtube ([[:print:]]+)\]/', $content, $matches)
$content could be a link such as http://www.youtube.com/watch?v=some_video
I can see it's filtering for youtube video, but I don't get how it's doing it. More specifically, what's the role of [:print:]?

Here's an experiment in box drawing.
/\[youtube ([[:print:]]+)\]/
│ │        ││           │
│ │        ││           └─ close the matched string
│ │        │└───────────── start the character class
│ │        └────────────── open the matched string
│ └─────────────────────── literal square bracket
└───────────────────────── start the regexp
The important bit is the part inside parentheses. That gets captured by your programming language for re-use as a variable, so that you can construct your replacement URL.
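For example, a minimal sketch of how the capture ends up in $matches (the surrounding shortcode text is just an assumption for illustration):

$content = 'Intro text [youtube http://www.youtube.com/watch?v=some_video] outro';
if (preg_match('/\[youtube ([[:print:]]+)\]/', $content, $matches)) {
    echo $matches[0]; // whole match: [youtube http://www.youtube.com/watch?v=some_video]
    echo $matches[1]; // captured group: http://www.youtube.com/watch?v=some_video
}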

Your answer is here: http://www.php.net/manual/en/function.preg-match-all.php#81559
"[:print:] - printing characters, including space"


Loop over CSV rows in Jekyll

I'm encountering some issues with loading data from a CSV file with Jekyll.
I've made some custom collections using the following _config.yml setup:
collections_dir: collections
collections:
  people:
    output: true
  publications:
    output: true
Now, this is what my structure looks like (just showing the relevant parts):
.
├── collections
│   ├── _people
│   │   ├── x.md
│   │   └── y.md
│   └── _publications
│       └── data.csv
├── index.html
└── _config.yml
When I try to loop over my collections using this link as a guide, it doesn't seem to work:
# excerpt from my index.html file
{% for row in site.publications.data %}
<p>name: {{row.name}}</p>
{% endfor %}
I think you are after a data file, rather than a collection here. The main differences are:
Data files can have each item as entries in a single file, or separate files in a subfolder of _data
Collections have each item as a separate file only
Data entries do not output a page per entry
Collections can output a page per entry
Here's how you would change this to data:
Move collections/_publications/data.csv to _data/publications.csv
Remove the publications entry from collections in _config.yml
Change your loop to the following:
{% for row in site.data.publications %}
<p>name: {{ row.name }}</p>
{% endfor %}
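For reference, a hypothetical _data/publications.csv (the name column is assumed from your loop):

name,year
First paper,2020
Second paper,2021

With that file in place, the loop above would render:

<p>name: First paper</p>
<p>name: Second paper</p>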
If you want to use a data file and output a page per entry, a popular plugin to do so is https://github.com/avillafiorita/jekyll-datapage_gen
Alternatively, you could split the CSV into separate Markdown files and use a collection to avoid adding a plugin.

Using ../ in a go:embed annotation

I want to embed a file placed one level above the Go source file. For example:
dir1
    file.go
dir2
    file.txt
How to embed file.txt inside file.go using go:embed?
The documentation states:
Patterns may not contain ‘.’ or ‘..’ or empty path elements, nor may they begin or end with a slash.
So what you are trying to do is not supported directly. Further information is available in the comments on this issue.
One thing you can do is to put a Go file in dir2, embed file.txt in that, and then import/use that in dir1/file.go (assuming the folders are in the same module).
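A minimal sketch of that workaround (the file name content.go, the package names, and the module path example.com/m are hypothetical):

// dir2/content.go
package dir2

import _ "embed"

//go:embed file.txt
var FileTxt []byte

// dir1/file.go
package dir1

import (
    "fmt"

    "example.com/m/dir2" // hypothetical module path
)

func PrintFile() {
    fmt.Println(string(dir2.FileTxt))
}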
This is not supported by the embed package, as stated by @Brits (https://pkg.go.dev/embed).
A pattern I like to use is to create a resources.go file in my project's internal package and put all my embedded resources in there, e.g.:
├── cmd\
│   └── cool.go
└── internal\
    └── resources\
        ├── resources.go
        ├── fonts\
        │   └── coolfont.ttf
        └── icons\
            └── coolicon.ico
resources.go
import _ "embed"
//go:embed fonts/coolfont.fs
var fonts byte[] // embed single file
//go:embed icons/*
var icons embed.FS // embed whole directory
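And a hypothetical usage from cmd/cool.go (the module path example.com/m is an assumption; internal packages are importable from within the same module):

// cmd/cool.go
package main

import (
    "fmt"

    "example.com/m/internal/resources" // hypothetical module path
)

func main() {
    fmt.Printf("embedded %d font bytes\n", len(resources.Fonts))
}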
There are libraries that can help with this as well, such as those listed here: https://github.com/avelino/awesome-go#resource-embedding
But I've not run into a use case where plain old embed wasn't enough for my needs.

bash: how to list a leaf directory in an unknown path?

I have multiple directories that in turn contain subdirectories. Example:
company_a/raw/2020/12
The value of the first directory (company_a in the sample above) is variable, but always follows a "word_letter" pattern.
The value of the second directory, raw, is fixed.
The values of the last two directories (/2020/12 in the sample above) are variable.
My purpose is to extract the size of each leaf subdirectory (given the sample path above, the leaf subdir would be 12/) using a for loop.
Is there some kind of reverse basename utility which would allow me to list the entire path, using company_x/ dir as the root dir? Because if I want to extract directories' size, first I need to figure out how to list the last directories in the path.
A sample tree for reference:
$ tree company_b
company_b
└── raw
    └── 2020
        ├── 05
        │   └── data.raw
        ├── 06
        │   └── data.raw
        ├── 07
        │   └── data.raw
        └── 08
            └── data.raw

6 directories, 4 files
The du command does this very well using wildcards.
du -h */raw/*/*
Output:
80K company_b/raw/2021/02
80K company_b/raw/2021/05
80K company_b/raw/2021/04
80K company_b/raw/2021/01
80K company_b/raw/2021/03
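If you specifically want a for loop (the question mentions one), a minimal sketch along the same lines, assuming the word_letter/raw/year/month layout above:

for leaf in */raw/*/*/; do
    [ -d "$leaf" ] || continue           # only directories at this depth
    size=$(du -sh "$leaf" | cut -f1)     # -s: just the total for this directory
    printf '%s\t%s\n' "$size" "$leaf"
done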

Bash (Mac) - Moving Randomized Files to Multiple Other Folders with a "Fill Limit"

I have a couple of hundred files in one folder, and I'd like to randomly move them to a number of different folders with a bash script - however, I'd like to fill each of those destination folders only up to a given capacity.
I'm thinking the right way to approach this is to assign two arrays, one containing all destination folders and one containing all files. Then I can randomly take a file from the filesarr and place it in a destination folder. My question is, how can I limit the number of files placed in each destination folder? So say I'm looking for ten files per destination folder - how can I move the first ten files from filesarr to the first folder in foldersarr, then move the next ten to the second folder in foldersarr, until all files have been moved? I know I should probably use a counter here, but my current attempt (below) is not doing the trick.
filesarr=(/Path/to/files/*)                 # this is the array of files to shuffle
foldersarr=(/Path/to/destination/folders/)  # array of folders to move into
foldercount=0                               # set it to 0
for afolder in "${foldersarr[@]}"; do
    if [[ "$foldercount" -gt 10 ]]; then
        echo "$foldercount files in folder, exiting and moving to next folder"
        exit 1
    else
        for afile in "${filesarr[@]}"; do   # do loop length(array) times; once for each file
            length=${#filesarr[@]}
            randomi=$(( $RANDOM % $length ))    # select a random index
            filename=${filesarr[$randomi]}
            mv ${filename} ${foldersarr[@]}
            echo "moving '$filename'"
            foldercount=$((foldercount+1))
            unset -v "filesarr[$randomi]"       # unset after moved
            array=("${filesarr[@]}")            # remove NULL elements introduced by unset; copy array
        done
    fi
done
My current directory structure consists of all the files in a "holding" directory, and all the destination folders where I'd like to move them in a separate folder.
rootfolder
│
├── holding
│   ├── dywd.pdf
│   ├── ... (approx. 200 files)
│   └── kjfwekfjnwe.pdf
│
└── destinations
    ├── folder01
    ├── ...
    └── folder10
I'd like to end up with this:
rootfolder
│
├── holding
│
└── destinations
    ├── folder01
    │   ├── lwkejdwe.pdf
    │   ├── ...
    │   └── (ten files in this folder)
    ├── ...
    │
    └── folderXX
        ├── qwuoe.pdf
        ├── ...
        └── (ten files in this folder)
Something like this (not tested):
dirs=(..)                        # array of dirs
dir_length=${#dirs[@]}
c=0
find . -maxdepth 1 -type f |     # or any other list of files
shuf |
while IFS= read -r file; do
    mv "$file" "${dirs[c++ % dir_length]}"
done
This will round-robin the files into the target directories. The randomness comes from shuf, so there is no need to maintain the list of files separately.
You could create some "bucket" variables and fill each one with the same number of file names, i.e. divide all your files in scope into these buckets. Then, when done, write each bucket into a separate folder.
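A rough sketch combining both ideas with a hard per-folder cap (the paths and the limit of 10 are taken from the question; it assumes file names without newlines and that shuf is available, e.g. from coreutils):

limit=10
files=(rootfolder/holding/*)
folders=(rootfolder/destinations/*/)

# Shuffle the list of files once, into a new array.
shuffled=()
while IFS= read -r f; do shuffled+=("$f"); done < <(printf '%s\n' "${files[@]}" | shuf)

# Hand out at most $limit files to each destination folder.
i=0
for folder in "${folders[@]}"; do
    count=0
    while (( count < limit && i < ${#shuffled[@]} )); do
        mv "${shuffled[i]}" "$folder"
        (( i++, count++ ))
    done
done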

bash script copy multiple folders and files logic

Bash Script:
Each file inside the table directories will need to be renamed from keyspace to newkeyspace_456 when it is copied to the destination.
└── Main_folder
    ├── keyspace
    │   ├── tableA-12323/keyspace-tableA-12323-ka-1-Data.db
    │   ├── tableB-123425/keyspace-tableA-123425-ka-1-Data.db
    │   └── tableC-12342/keyspace-tableA-12342-ka-1-Data.db
    └── newkeyspace_456 (given folder) and sub folders
        ├── tableA-12523
        ├── tableB-173425
        └── tableC-1242
An example:
keyspace/tableA-12323/keyspace-tableA-12323-ka-1-Data.db
to
newkeyspace_456/tableA-12523/newkeyspace_456-tableA-12523-ka-1-Data.db
Note that a table of a given type (A, B, C) can only be copied to the same table type in the other keyspace. The table ID in the file name also needs to change; note in the example that 12323 has been renamed to 12523 when copied to the directory newkeyspace_456/tableA-12523.
Type A table files from keyspace/tableA-12323 are copied to Type A table files in newkeyspace_456/tableA-12523.
How do I approach this problem?
Thanks
tom
Use parameter expansion with string substitution to change the file names, like this:
for fn in $(find ./keyspace -path '*.db'); do cp "$fn" "${fn//keyspace/newkeyspace_456}"; done
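That substitution only changes the keyspace name, though, while the table IDs in the directory and file names also differ (12323 vs 12523). A rough sketch that maps those as well, assuming the Main_folder layout above (run from inside Main_folder) and exactly one matching tableX-<id> directory per table type under newkeyspace_456:

src=keyspace
dst=newkeyspace_456

for srcdir in "$src"/table*/; do
    table=$(basename "$srcdir")             # e.g. tableA-12323
    prefix=${table%%-*}                     # e.g. tableA
    matches=( "$dst/$prefix"-*/ )           # matching table dir in the new keyspace
    newtable=$(basename "${matches[0]}")    # e.g. tableA-12523

    for f in "$srcdir"*.db; do
        name=$(basename "$f")
        # replace both the keyspace name and the table id in the file name
        cp "$f" "${matches[0]}${name/$src-$table/$dst-$newtable}"
    done
done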
