Distributing data on a cluster (using torrents?)

I hope this is a good place to ask this, otherwise please redirect me to the correct forum.
I have a large amount of data (~400 GB) that I need to distribute to all the nodes in a cluster (~100 nodes). Any help with how to do this would be appreciated; what follows is what I've tried.
I was thinking of doing this using torrents but I'm running into a bunch of issues. These are the steps I tried:
I downloaded ctorrent to create the torrent, seed it, and download it, but I had a problem because I didn't have a tracker.
I found that qbittorrent-nox has an embedded tracker so I downloaded that on one of my nodes and set the tracker up.
I then created the torrent using that tracker and copied it to my nodes.
When I run the torrent with ctorrent on the node that holds the actual data, so it can seed, I get:
Seed for others 72 hours
- 0/0/1 [1/1/1] 0MB,0MB | 0,0K/s | 0,0K E:0,1 Connecting
When I run it on one of the nodes to download the data I get:
- 0/0/1 [0/1/0] 0MB,0MB | 0,0K/s | 0,0K E:0,1
So it seems they aren't connecting to the tracker properly, but I don't know why.
I am probably doing something very wrong, but I can't figure it out.
If anyone can help me with what I am doing, or has another way of distributing the data efficiently, even without torrents, I would be very happy to hear it.
Thanks in advance for any help available.

But the node that's supposed to be seeding thinks it has 0% of the file, and so it doesn't seed.
If you create a metadata file (.torrent) with tool A and then want to seed it with tool B, you need to point B to both the metadata and the data itself (the content files).
I know this is a different issue now, and might require a different topic, but I'm hoping you might have ideas.
You should create a new question which will have more room for you to provide details.

So this is embarrassing: I may have had this working for a while now, as I did change my implementation since I started. I just re-checked, and the files I was transferring were corrupted on one of my earlier tries, and I had been using them ever since.
So to sum up, this is what worked for me, in case anybody else ends up needing the same setup:
I create the torrent using "transmission-create /path/to/file/or/directory/to/be/torrented -o /path/to/output/directory/output_file_name.torrent" (this is because qbittorrent-nox doesn't provide a tool that I could find for creating torrents).
I open the torrent on the computer with the actual files, so it will seed, using "qbittorrent-nox ~/path/to/torrent/file/name_of_file.torrent".
I copy the .torrent file to all the nodes and run "qbittorrent-nox ~/path/to/torrent/file/name_of_file.torrent" on each one to start downloading (see the sketch after the settings below for one way to script this across all the nodes).
qbittorrent settings I needed to configure:
In "Downloads" change "Save files to location" to the location of the data in the node that is going to be seeding #otherwise that node wont know it has the files specified in the torrent and wont seed them.
To avoid issues with the torrents sometimes starting as queued and requiring a "force resume" (this doesn't appear to have fixed the problem 100%, though):
In "Speed" tab uncheck "Enable bandwidth management (uTP)"
uncheck "Apply rate limit to uTP connections"
In "BitTorrent" tab uncheck "Torrent Queueing"
Thanks for all the help, and I'm sorry I hassled people for no reason past a certain point.

Related

are COPY commands possible with MonetDBe-Python?

I was having some trouble bulk-loading records any faster than cursor.executemany allows. I hoped the bulk operations documented for regular MonetDB here might work, so I tried an export as a test, e.g. cursor.execute("COPY SELECT * FROM foo INTO '/file/path.csv'"). This doesn't raise an error unless the file already exists, but the resulting file is always 0 bytes. I tried the same with STDOUT as the file and it prints nothing.
Are these COPY commands meant to work on the embedded version?
Note: this is my first use of anything related to MonetDB. As a fan of SQLite and a not-super-impressed user of Amazon Redshift, this seemed like a neat project. I'm not sure if MonetDB/e is the same as MonetDBLite; the former seems more active lately?
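In case it helps, here is roughly the kind of minimal script I'm testing with. It is only a sketch, assuming the monetdbe package and an in-memory database, with the table name and output path as placeholders:

import monetdbe

# in-memory database just for the test (assumption; a file-backed database would be used the same way)
conn = monetdbe.connect(':memory:')
cur = conn.cursor()

cur.execute("CREATE TABLE foo (i INT)")
cur.execute("INSERT INTO foo VALUES (1), (2), (3)")
conn.commit()

# runs without raising (unless the file already exists), but the resulting file stays at 0 bytes
cur.execute("COPY SELECT * FROM foo INTO '/tmp/foo.csv'")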
Exporting data through a COPY INTO command should be possible in MonetDB/e, yes.
However, this feature is currently not working. I was able to reproduce your problem: the COPY INTO creates the file that the data should be exported to, but doesn't write the data. This does not happen with regular MonetDB.
Our team has been notified of this issue, and we're looking into it. Thanks for the heads up!
PS: Regarding your doubt about MonetDB/e vs MonetDBLite: our team no longer develops or maintains MonetDBLite. Both are embedded databases that use MonetDB as the core engine, but MonetDBLite is deprecated. After having learnt some do's and don'ts with MonetDBLite, our team is developing our next generation of embedded databases.
So for your embedded database needs, you should follow what's coming out of our MonetDB/e projects.
I've created a test for it at: https://github.com/MonetDBSolutions/monetdbe-examples/blob/CI/C/copy_into.c
Also filed a bug report over on GitHub: https://github.com/MonetDB/MonetDB/issues/7058
We're currently looking into this issue.

SCCM OSD TS End User Summary Screen

I am looking for a good way to add a summary to an existing large build task sequence (TS).
I am working with SCCM 2012 R2, and what I need is a hint on how to capture all the steps I want (some of them are in various groups) and put their results in some sort of variable, so that at the end the person building that PC sees a table showing, let's say, 30 applications green and 4 red as failures.
Can it be done in some easy way? I just need the person building the PC to see which apps didn't install, so they can install them manually or at least give me more information before I dive into the logs.
Thanks
I wouldn't say easy, because it requires a lot of steps and you basically have to do it manually per application, but there is a TS variable, _SMSTSLastActionSucceeded, which you can check after each installation step (you have to set the step to continue on error to make this work). So, after each attempted install, you check whether it worked and then set a TS variable of your choice to record the failure.
As a final step you implement a script that checks all your TS variables and outputs the result.
You could even use the addon OSDBackground to display your errors as the background image.
A lengthy article on how to implement a form of error handling can be found here; however, you would have to do this quite a bit differently, because in that example the TS fails at the first error, whereas you want to continue and log. Still, you should get the basic principles from it.

System.log and Exception.log size and maintenance?

Our developer left a few weeks ago, so I have been attempting to get myself a little more acquainted with the systems and troubleshoot a few issues around the site.
I attempted to look at our log files in var/log/ before realizing that they are, erm, several GB each. Despite knowing almost nothing about how big an ideal log file should be, the fact that opening one very handily crashed my computer suggests it is too big. Before my computer crashed, I could see records in the file that are over five months old.
Is it safe to simply delete system.log and exception.log?
When I search for this, I get a lot of results related to Log Cleaning...
In Magento there are settings under System to turn on 'Log Cleaning', but I suspect that has nothing to do with these two log files, because a) it looks like it is set to clean daily and b) cron is set up. If I am incorrect about that, please let me know so I can look into why it is not working correctly.
You guessed correctly. The log cleaning settings refer to tables of logs in the database. The files in var/log/ are safe to delete, as are most files in var/. If you do not need these files then it is simplest to turn them off from System > Configuration > Developer > Log Settings.
However, if you do think you want some logs just in case, then learn how to use logrotate on your server; it can compress files and delete the oldest ones for you.

jenkins started with all jobs lost, trying to use 'copy existing job' feature

The CI server was disconnected from the network for a while for some strange reason, and when it came back up, Jenkins displayed no jobs. However, in the directory where the jobs live, /var/lib/jenkins/jobs/, the two jobs that should appear are there; they just don't show any evidence of existence in the web client.
I tried using the 'copy existing job' feature and pointed it at /var/lib/jenkins/jobs/existing_test, but it tells me: no such job /var/lib/jenkins/jobs/existing_test.
Any suggestions as to how to get this to work?
I know this question may be outdated, but a possible solution is to run Jenkins under the appropriate user (the one it ran as previously). This helped me.
I ended up just building the jobs brand new; I wasn't able to find a fix.
First, I would try looking in the Jenkins logs; as your data is in /var/lib/jenkins, I would guess your log files are in /var/log/jenkins. Maybe you can find out what's wrong from there.
You could also try the "Reload Configuration from Disk" link in the "Manage Jenkins" view. That should reload the configuration files from your directories and may bring your jobs back. Either way, you should be able to see something in your logs. If the logs are empty, check file permissions; I used to have problems with that after updating sometimes.

How do I schedule for all the files of one folder to be moved to another server?

I did a quick search and couldn't see anything relevant, which I found strange, as this seems like it would be a common question. Maybe I'm just going about it the wrong way, or being thick? Who knows.
Anyway, I am trying to set up a scheduled task that moves all the files in a folder on Server A to a folder on Server B. If this were simply a matter of copying them it would be fine, as I already have that working using Core FTP and a batch file, but I'd like the files to be removed from Server A after the copy has taken place.
I was looking at the Windows ftp commands, but although I managed to log on to Server A successfully from Server B, whenever I tried to run a command it just took a very long time and then disconnected.
Any help with this would be appreciated. I need it to be a schedulable file, but it doesn't matter whether it is a .bat, .vbs, or anything else that I haven't thought of.
Thanks,
Harry
You could use www.Dropbox.com.
Why? For stability. Any home-brew FTP script that moves files is prone to undetected transmission errors, resulting in deleted files.
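If you do decide to script the move yourself, the safer pattern is copy, verify, then delete. Here is a rough sketch with Python's ftplib, assuming Python is available on Server B and plain FTP access to Server A; the host, credentials, and folder paths are placeholders. A file is removed from Server A only once the downloaded copy matches the remote size:

import os
from ftplib import FTP

HOST, USER, PASSWORD = "server-a.example.com", "user", "secret"   # placeholders
REMOTE_DIR = "/outgoing"        # folder on Server A (placeholder)
LOCAL_DIR = r"C:\incoming"      # folder on Server B (placeholder)

with FTP(HOST) as ftp:
    ftp.login(USER, PASSWORD)
    ftp.cwd(REMOTE_DIR)
    for name in ftp.nlst():     # assumes the folder holds only plain files
        local_path = os.path.join(LOCAL_DIR, name)
        with open(local_path, "wb") as fh:
            ftp.retrbinary("RETR " + name, fh.write)
        # delete the original only if the downloaded copy matches the remote size
        if ftp.size(name) == os.path.getsize(local_path):
            ftp.delete(name)

A script like this can be run from Windows Task Scheduler just as a .bat or .vbs would be.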
