In my site built on Ruby on Rails, I need to provide functionality to trim songs (say first 20 sec). Does anybody know any relevant API to manipulate songs (like 'rmagick' for images)?
You could try https://github.com/fugalh/ruby-audio. It looks a little out of date, but there's probably a fork with updates.
Another solution might be to limit how much of the song plays via JavaScript.
And yet another option might be just to make the snippets yourself.
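If you go the route of making the snippets yourself, one way is to shell out from Ruby to a command-line audio tool. The sketch below assumes ffmpeg (not mentioned above, so treat it as a swapped-in suggestion) is installed and on the PATH; the method names and file names are made up for illustration.

```ruby
# Builds the argument list for an ffmpeg call that keeps only the first
# `seconds` of `input`. Assumption: the ffmpeg binary is installed and on PATH.
#   -y       overwrite the output file if it already exists
#   -t N     stop writing after N seconds
#   -c copy  copy the audio stream instead of re-encoding it
def trim_command(input, output, seconds = 20)
  ["ffmpeg", "-y", "-i", input, "-t", seconds.to_s, "-c", "copy", output]
end

# Runs the command. Returns true when ffmpeg exits successfully,
# false on a non-zero exit, and nil if the binary could not be started.
def trim_song(input, output, seconds = 20)
  system(*trim_command(input, output, seconds))
end
```

Something like trim_song("song.mp3", "preview.mp3") would then write the preview file. Passing the arguments as an array avoids going through a shell, so unusual file names are not interpreted as shell syntax.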
I like reading the PoC||GTFO issues and one thing I found remarkable when I first discovered it, was the "polyglot" nature of their PDF files.
Let me explain: consider, for example, their 8th issue. You can unzip files from it and run it as a script to execute the encryption they are talking about. Even better (worse?), their 9th issue can even be played as a music file!
I'm currently in the process of writing small scripts every week and writing each time a little one page PDF in LaTeX to explain the said scripts. So I would really enjoy being able to create the same kind of PDF files. Sadly they explained (partly) in their first issue how to include zip files, but they did so through three small sketches of cmd lines without actual explanations.
So my question is basically:
how can one create such a polyglot PDF file containing stuff like a zip as well as being a shell script which may be run using arguments just like normal scripts?
I'm asking here about the process of creation, not just an explanation of how this is possible. Ideally, there would already be some scripts or programs that make it easy to create such PDF files.
I've tried to search the net for the keywords "polyglot files" and others of the kind and wasn't able to find any useful matches. Maybe this process has another name?
I've already read the presentation by Julia Wolf, which explains how things work, but sadly I haven't had time to apply that knowledge to the real world, because I'm not used to playing with file headers and the way a PDF is constructed.
EDIT:
Okay, I've read more and found the 7th issue of PoC||GTFO to be really informative on this subject. I may end up being able to write my own scripts to create such polyglot PDF files if I have some more time to look into it.
I played around with polyglots myself after attending Ange's talks and also talking to him in person. You really need to understand the file formats to be able to nest them into each other.
However, long story short, here are some links I found extremely useful for creating polyglots:
Some older Google Code Trunk
PoC of the polyglot stuff
The second link (to GitHub) in particular will help you create polyglots, and also understand how they work and how they are implemented. Since it is mostly Python and very cleanly written, it is useful and easy to follow.
I feel dissecting some file formats would be a good place to start. You can find many file format specifications for different file types through Google, but they can be a tough read and will likely take you some time to translate into whatever language you are using.
PDF: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
ELF: https://www.cs.cmu.edu/afs/cs/academic/class/15213-s00/doc/elf.pdf
ZIP: http://kat.sdf.org/zip_file_format.txt
The language(s) you select will need a way to read and write raw bytes (not just ASCII alphanumerics), so perhaps C would be good for more direct access to memory. Python could also work and would make the scripts easy to open source.
To dissect the files, you may want to build a tool along the lines of https://github.com/kvesel/zipbrk/ to take them apart, then put them all back together in a polyglot format. For example, ZIP does not require its headers to be at the start of the file (or even to be contiguous, for that matter), and the PDF magic number can appear in multiple places within the file as well. I also seem to recall a polyglot tool being included in one of the PoC||GTFO issues (maybe issue 8 or 2?) as a polyglot in the PDF file.
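As a first step towards such a dissection tool, here is a minimal sketch that only reports where the format markers sit in a file. It relies on the fact that ZIP readers find the archive through the end-of-central-directory record (signature PK\x05\x06) near the end of the file, which is why ZIP data does not have to start at offset 0. The method name and the returned hash layout are made up for illustration.

```ruby
PDF_MAGIC = "%PDF-".b
ZIP_EOCD  = "PK\x05\x06".b # ZIP end-of-central-directory signature

# Returns the byte offset of the first PDF header and of the last ZIP
# end-of-central-directory record in `bytes` (nil where a marker is absent).
def locate_signatures(bytes)
  data = bytes.b # work on raw bytes, not on encoded characters
  { pdf_header: data.index(PDF_MAGIC), zip_eocd: data.rindex(ZIP_EOCD) }
end
```

For a real file, something like locate_signatures(File.binread("issue.pdf")) would apply, where "issue.pdf" is a placeholder name. A file for which both offsets are non-nil is at least a candidate PDF/ZIP polyglot; checking that each part is actually valid takes a proper parser for that format.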
Don't forget the hacker's bible! :)
https://nostarch.com/gtfo
So what I would like to do is scrape this site: http://boxerbiography.blogspot.com/
and create one HTML page that I can either print or send to my Kindle.
I am thinking of using Hpricot, but am not too sure how to proceed.
How do I set it up so it recursively checks each link, gets the HTML, either stores it in a variable or dumps it to the main HTML page and then goes back to the table of contents and keeps doing that?
You don't have to tell me EXACTLY how to do it, but just the theory behind how I might want to approach it.
Do I literally have to look at the source of one of the articles (which is EXTREMELY ugly btw), e.g. view-source:http://boxerbiography.blogspot.com/2006/12/10-progamer-lim-yohwan-e-sports-icon.html and manually programme the script to extract text between certain tags (e.g. h3, p, etc.)?
If I do that approach, then I will have to look at each individual source for each chapter/article and then do that. Kinda defeats the purpose of writing a script to do it, no?
Ideally I would like a script that will be able to tell the difference between JS and other code and just the 'text' and dump it (formatted with the proper headings and such).
Would really appreciate some guidance.
Thanks.
I'd recommend using Nokogiri instead of Hpricot. It's more robust, uses fewer resources, has fewer bugs, and is easier to use and faster.
I did a lot of scraping for work at one time and had to switch to Nokogiri, because Hpricot would inexplicably crash on some pages.
Check this RailsCast:
http://railscasts.com/episodes/190-screen-scraping-with-nokogiri
and:
http://nokogiri.org/
http://www.rubyinside.com/nokogiri-ruby-html-parser-and-xml-parser-1288.html
http://www.engineyard.com/blog/2010/getting-started-with-nokogiri/
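For the traversal the question asks about (start at the table of contents, follow each link once, collect the text), a minimal sketch might look like the following. Fetching and parsing are left out on purpose: `fetch` is a hypothetical callable that, given a URL, returns the page text and the links found on it. In a real script that is where Nokogiri would do the work, and you would probably also want to skip links that leave the blog.

```ruby
require "set"

# Visits the start page and every page reachable from it, each exactly once,
# in breadth-first order. `fetch` is a hypothetical callable: given a URL it
# returns [text, links], where links is an array of URLs found on that page.
def collect_pages(start_url, fetch)
  seen  = Set.new([start_url])
  queue = [start_url]
  pages = []
  until queue.empty?
    url = queue.shift
    text, links = fetch.call(url)
    pages << { url: url, text: text }
    # Set#add? returns nil for URLs we have already queued, so links back
    # to the table of contents do not cause an endless loop.
    links.each { |link| queue << link if seen.add?(link) }
  end
  pages
end
```

The collected pages can then be joined into one HTML document in the order they were visited. Keeping a set of seen URLs is what answers the "goes back to the table of contents and keeps doing that" part: every chapter links back to the contents page, but it is only processed once.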
I'm building a site that allows users to make sense of a debate by graphically representing arguments for and against a particular issue. (Wrangl)
I'd like to categorise these debates so they are more easily found and connected. I don't want to irritate the person creating the debate by asking them to add tags and categories before they see any benefit, so I'm looking at a way of automatically extracting keywords.
What's a good approach for taking the debate's title and description (and possibly the content of the arguments themselves once there are some) to pull out, say, ten strong keywords that could be used as metadata to connect similar debates together, or even as the content of the "meta" keywords tag in the head of the HTML page where the debate is viewable? E.g. DataMapper vs ActiveRecord
The site is coded in Ruby with Sinatra, using DataMapper for data storage. I'm ideally looking for something which will work on Heroku (I don't have a way of writing files to disk dynamically), and I'd consider a web service, an API or ideally a Ruby gem.
Maybe you can use TextAnalyzer.
I understand that you want an easy way of achieving this. I've recently dived into the world of NLP (Natural Language Processing) and text mining, and it's a daunting subject, most of which went far above my head.
I did manage to code some functionality that resembles what you're looking for, though I did it in PHP. If you want it tailored to your project (Wrangl), I would suggest doing it yourself.
You could use the Porter stemming algorithm, for which I'm sure there will be Ruby code:
Ruby Porter stemmer
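As a rough starting point for doing it yourself, the sketch below simply counts words after dropping a few common stop words. It does not do any stemming; a Porter stemmer would be applied to each word before counting so that, for example, "debate" and "debates" end up as one keyword. The stop-word list and the method name are illustrative only, and `tally` needs Ruby 2.7 or later.

```ruby
# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = %w[a an and are as at be by for from in is it of on or
                that the this to vs with].freeze

# Returns the `limit` most frequent non-stop-words in `text`,
# most frequent first (ties are broken alphabetically).
def keywords(text, limit = 10)
  words  = text.downcase.scan(/[a-z][a-z0-9']+/) # words of 2+ characters
  counts = (words - STOP_WORDS).tally
  counts.sort_by { |word, count| [-count, word] }.first(limit).map(&:first)
end
```

Called on a debate's title and description joined together, this gives a crude keyword list that runs in memory and writes nothing to disk, so it should be compatible with the Heroku constraint mentioned in the question.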
You can try the salsaAPI to automatically extract keywords and categorize the debates!
I have a Shopify store, and I want to automatically update the inventory levels of its product variants using a live XML feed from the wholesaler I use.
I'm learning to program (Ruby) and this is my first project, but after some research, here is how I think it should work.
Use Ruby/Nokogiri to parse the XML feed from the wholesaler, and then XPath to locate both the unique product variant SKU code and the stock level.
Somehow I need to use this SKU to refer back to my Shopify store's product XML list and pull out the variant's unique ID.
Then use something like the builder gem to build the XML format that Shopify needs, and then use curl to PUT the changes. I'm guessing I loop this process for every product?
I know Shopify only has a 300 call limit, so I've got the article on putting a delay in the script, but I get the feeling the above method isn't the easiest way to go about this?
With Shopify you need to apply the variant stock level update against unique variant xml files, so I need to build the unique xml file/code and PUT it against /admin/variants/#[thevariantid].xml
I'm looking forward to trying to put this together and learning in the process, but am I on the right track with this? Are there simpler gems I should be looking at?
N.B. I've only recently started learning Ruby and will head to Rails afterwards. I know a bit about XML and its structure, so I should be OK finding what I need with XPath.
You’re on the right track, but I’d use the shopify_api gem to do the talking to Shopify instead of having to form the XML and URIs yourself: https://github.com/Shopify/shopify_api
There’s an article on our wiki that might also help you out with regards to the API call limit but just let me know if you need more space – we’re pretty flexible and the limit is really just there to keep scripts from going wild and affecting service for everyone else.
Your proposed path seems good, except that there's no need to use the 'builder' gem, as Nokogiri has some very nice XML-building built into it.
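The parsing and matching steps from the question could be sketched roughly as follows. To keep the example runnable without extra gems it uses REXML, which ships with Ruby, rather than Nokogiri; the XPath idea is the same. The feed layout (product, sku and stock elements) and the SKU-to-variant-ID lookup are assumptions made for illustration, since the real feed and store data are not shown, and the actual call to Shopify is left out.

```ruby
require "rexml/document"

# Reads a wholesaler feed and returns { sku => stock_level }.
# Assumption: the feed contains <product> elements, each with a <sku>
# and a <stock> child. The real feed will use its own element names.
def stock_by_sku(feed_xml)
  doc = REXML::Document.new(feed_xml)
  stock = {}
  REXML::XPath.each(doc, "//product") do |product|
    sku = product.elements["sku"].text.strip
    stock[sku] = product.elements["stock"].text.to_i
  end
  stock
end

# Pairs each Shopify variant ID with its new stock level.
# `variant_ids` is a hypothetical { sku => variant_id } lookup built once
# from the store's product list. SKUs missing from the store are skipped.
def inventory_updates(stock, variant_ids)
  stock.each_with_object({}) do |(sku, level), updates|
    id = variant_ids[sku]
    updates[id] = level if id
  end
end
```

Building the SKU lookup once up front means the loop only needs one request per variant that actually exists in the store, which helps with the call limit.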
I'm looking for a step-by-step tutorial for making a not-too-complex app in Ruby, so students can do it. For now, I only have medium-to-large examples that I developed for companies some years ago, but they require extra knowledge because I used different frameworks and libraries, and I want something that can be done with only the Ruby interpreter itself.
A well-commented app would be good too, as I could write a step-by-step guide based on it. Maybe I could write one myself, but I'm running out of time and I haven't used Ruby in about 1.5 to 2 years. So, as I said, I'm looking for something not too complex and not too big; 200, 300, 400, or 500 lines of code is OK.
It could be anything, for example something for administration or management purposes: a script that generates Word documents for a certain department, or a script that reads a .txt or .doc file and does something with it.
Thanks in advance!
It's not an app really, but it's smallish, it's Ruby, it's sort of a game, and it's fun. http://github.com/ryanb/ruby-warrior