Does anyone know where I can quickly register and get different VAST 2/3 feeds? I need to add VAST/VPAID support to a video player and I'm looking for example feeds.
I see that the IAB has samples for VAST 2, but not for VAST 3: http://www.iab.net/guidelines/508676/digitalvideo/vast/vast_xml_samples
I requested registration at https://www.google.com/doubleclick/publishers/index.html but have been waiting for two days :(
Thanks for any help!
Forget DoubleClick as a source of sample feeds; there isn't one, actually, because all the creatives are subject to licensing. Here are some links with VAST 2.0 XML examples:
http://demo.tremorvideo.com/proddev/vast/vast_inline_linear.xml
http://demo.tremorvideo.com/proddev/vast/vast_wrapper_linear_2.xml
https://github.com/dailymotion/vast-client-js/tree/master/test (Some sample XMLs)
https://github.com/theonion/videojs-vast-plugin/blob/master/spec/sample-vast.xml
Actually, if you search GitHub you'll find a pretty decent number of VAST 2/3 XML samples.
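If you just want to smoke-test a player's parser against one of these files, here is a minimal sketch in Python (standard library only; it assumes the Tremor demo URL above is still live):

    import urllib.request
    import xml.etree.ElementTree as ET

    # One of the sample VAST 2.0 inline linear responses listed above
    url = "http://demo.tremorvideo.com/proddev/vast/vast_inline_linear.xml"

    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())

    print("VAST version:", root.get("version"))

    # Walk the inline ads and print the media file URLs the player would load
    for ad in root.findall("Ad"):
        inline = ad.find("InLine")
        if inline is None:
            continue  # Wrapper ads point at another VAST document instead
        title = inline.findtext("AdTitle", default="").strip()
        for media in inline.iter("MediaFile"):
            print(title, media.get("type"), (media.text or "").strip())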
Site: www.purelocal.com.au
Tested thousands of URLs in Google PSI - all are green, 90%+.
However, in Google Search Console (Webmaster Tools) we have 0 good URLs.
Can someone please explain what Google requires and what we can do to pass Core Web Vitals before June?
We've spent months optimising everything and can't optimise any further, but Google says that none of our URLs pass Core Web Vitals... it's just ridiculous.
Looking at your website's report in the CrUX Dashboard, there are a couple of things you could optimize more:
First, your site's LCP is right on the edge of having 75% good desktop experiences, and phone experiences are below that at 66% good. https://web.dev/optimize-lcp/ has some great tips for addressing LCP issues.
Second, while your site's desktop FID experiences are overwhelmingly good (98%), you do seem to have a significant issue for phone users (only 44% good). There are similarly great tips in the https://web.dev/optimize-fid/ article.
While the big green "98" score on PSI makes it look like the page is nearly perfect, what matters most in terms of the user experience is real field data. That information can be found in the "Field Data" and "Origin Summary" sections of the report.
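If you want to pull that same field data programmatically rather than through PSI or the CrUX Dashboard, here is a minimal sketch against the public CrUX API in Python; the API key is a placeholder you would create in Google Cloud:

    import requests

    API_KEY = "YOUR_CRUX_API_KEY"  # placeholder; create one in the Google Cloud console
    ENDPOINT = f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}"

    payload = {
        "origin": "https://www.purelocal.com.au",
        "formFactor": "PHONE",
        "metrics": ["largest_contentful_paint", "first_input_delay"],
    }

    resp = requests.post(ENDPOINT, json=payload, timeout=30)
    resp.raise_for_status()
    metrics = resp.json()["record"]["metrics"]

    for name, data in metrics.items():
        good = data["histogram"][0]["density"]  # share of "good" experiences
        p75 = data["percentiles"]["p75"]
        print(f"{name}: p75={p75}, good={good:.0%}")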
You mentioned in the comments that your server response time is an issue. I can confirm this with lab testing:
https://webpagetest.org/graph_page_data.php?tests=210506_AiDcJ0_0283e8c51814788904bdf19cebe7a5c8&medianMetric=TTFB&fv=1&median_run=1&zero_start=true&control=NOSTAT#TTFB
https://webpagetest.org/result/210506_AiDcJ0_0283e8c51814788904bdf19cebe7a5c8/8/details/#waterfall_view_step1
The long light blue bar on line 1 of the chart above shows how long it takes your server to respond to the request. In this case the time to first byte (TTFB) is 1.132 seconds. This is going to be a huge problem for most users to achieve a fast LCP because in these tests it takes 1.9 seconds just to get the HTML to the client. No amount of frontend optimizations can make the HTML arrive sooner than that. You need to focus on backend optimizations to get the TTFB down.
I can't give you any specific hosting recommendations but it does seem like the shared hosting is adversely affecting your users' LCP performance.
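If you want a quick, rough TTFB check of your own outside WebPageTest, here is a minimal sketch in Python using the requests package; the measured value also includes DNS, TCP and TLS time, so treat it as an approximation:

    import time
    import requests

    url = "https://www.purelocal.com.au/"  # the origin under discussion

    start = time.perf_counter()
    # stream=True makes requests return as soon as the response headers arrive,
    # before the body is downloaded, which roughly approximates TTFB
    resp = requests.get(url, stream=True, timeout=30)
    ttfb = time.perf_counter() - start
    print(f"approx. TTFB: {ttfb:.3f}s (HTTP {resp.status_code})")
    resp.close()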
I am looking for algorithms that allow text extraction from websites. I do not mean "strip html", or any of the hundreds of libraries that allow this.
So for example for a news article I would like to identify the heading and all the text, but not the comments section and so on.
Are there any algorithms for that out there? Thank you!
In the computer science literature this problem is usually referred to as page segmentation or boilerplate detection. See the report Boilerplate Detection using Shallow Text Features and its related blog post. I also have a few reports and software sites bookmarked that address the problem; see this Stack Overflow question as well.
There are a few open source tools available that do similar article extraction tasks.
https://github.com/jiminoc/goose, which was open-sourced by Gravity.com
It has info on the wiki as well as the source you can view. There are dozens of unit tests that show the text extracted from various articles.
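If you'd rather try it from Python, there is a port of Goose called goose3; here is a minimal sketch (the article URL is just a placeholder):

    from goose3 import Goose  # Python port of the original Gravity goose

    g = Goose()
    article = g.extract(url="https://example.com/some-news-article")  # placeholder URL

    print(article.title)               # extracted headline
    print(article.cleaned_text[:500])  # main article body, boilerplate stripped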
"Content extraction" is a very difficult topic. There are no common standards to identify the "main-article" content (there are several approaches to make HTML easier readably for crawlers, e.g. schema.org, but none of these is very popularly used).
So it turns out, that if you want good results, its probably best to define your own XPath selectors for each (news) website you want to scrape. Although there are some APIs for HTML content extraction, but as I said its very hard to develop an algorithm which works for every site.
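As a concrete illustration of the per-site selector approach, here is a minimal sketch with requests and lxml; the XPath expressions are made up and would have to be written for each site you scrape:

    import requests
    from lxml import html

    # Hypothetical selectors for one specific news site; every site needs its own set
    SELECTORS = {
        "headline": "//h1[@class='article-title']/text()",
        "body": "//div[@class='article-body']//p//text()",
    }

    resp = requests.get("https://example.com/news/some-story", timeout=30)
    tree = html.fromstring(resp.content)

    headline = " ".join(tree.xpath(SELECTORS["headline"])).strip()
    body = "\n".join(t.strip() for t in tree.xpath(SELECTORS["body"]) if t.strip())
    print(headline)
    print(body)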
Some APIs you could use:
alchemyapi.com
diffbot.com
boilerpipe-web.appspot.com
aylien.com
textracto.com
What you're trying to do is called "content extraction". It turns out to be a surprisingly hard problem to solve well, and many naive solutions do quite badly.
Instapaper and Readability both have to solve this, and you may learn something from looking at their solutions. They also both provide services that you may be able to take advantage of - perhaps you can outsource your problem to them and let their API take care of it. :)
Failing that, a search for "html content extraction" returns a great deal of useful results, including a number of papers on the subject.
I compared a few different libraries and had really great luck with Mozilla's Readability library (Node) and its Python wrapper.
For example, take this CNN article: https://edition.cnn.com/2022/06/01/tech/elon-musk-tesla-ends-work-from-home/index.html
Readability successfully returns only the relevant data:
New York (CNN Business) Elon Musk is demanding that Tesla office workers return to in-person work or leave the company. The policy, disclosed in leaked emails Musk sent to Tesla's executive staff Tuesday, was first reported by electric vehicle news site Electrek. "Anyone who wishes to do remote work must be in the office for a minimum (and I mean *minimum*) of 40 hours per week or depart Tesla. This is less than we ask of factory workers," Musk wrote, adding that the office must be the employee's primary workplace where the other workers they regularly interact with are based — "not a remote branch office unrelated to the job duties." Musk said he would personally review any request for exemption from the policy, but that for the most part, "If you don't show up, we will assume you have resigned."
etc.
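For reference, here is roughly how that looks from Python; I'm assuming the wrapper in question is readability-lxml (pip install readability-lxml):

    import requests
    from readability import Document  # readability-lxml

    url = "https://edition.cnn.com/2022/06/01/tech/elon-musk-tesla-ends-work-from-home/index.html"
    resp = requests.get(url, timeout=30)

    doc = Document(resp.text)
    print(doc.title())    # article headline
    print(doc.summary())  # cleaned-up HTML of the main article body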
I think your best shot is to study what information you can get from the metadata and write a good HTML parser; oEmbed could be a good standard =)
https://oembed.com/#section7
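For what it's worth, here is a minimal sketch of oEmbed discovery and fetching in Python; the page URL is just an example, and the target site has to advertise an oEmbed endpoint via a <link> tag for this to work:

    import requests
    from lxml import html

    page_url = "https://www.youtube.com/watch?v=dQw4w9WgXcQ"  # example oEmbed provider page

    # Providers that support discovery advertise their endpoint in a <link> tag
    page = requests.get(page_url, timeout=30)
    tree = html.fromstring(page.content)
    links = tree.xpath('//link[@type="application/json+oembed"]/@href')

    if links:
        data = requests.get(links[0], timeout=30).json()
        print(data.get("title"), data.get("author_name"), data.get("type"))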
As a newbie programmer, I would like to know what the benefits are of using, for example, the Google Search API or the newest Buzz API for gathering content instead of screen scraping - obviously apart from the legal aspects.
APIs are less likely to change than a screen layout.
One big downside of screen scraping is that the screen can change and break your scraper. So you end up having to continually adjust your code to match theirs, and since you don't know about changes ahead of time, you suffer downtime/outages as a result.
Also, you may be violating their TOS, and they won't like it. If you have paying customers for your service, you can find yourself between a rock and a hard place pretty quickly.
Also, if you're simulating many users, you'll produce an unanticipated drag on the servers. So using a published/permitted API would be much more efficient for you, and for the web site serving up the source material.
I would really appreciate your help with finding out how long it takes a programmer with 1-3 years of experience to convert a few HTML pages into Joomla 1.5 dynamic pages. I know that some of it depends on how complex the pages are, but I'm talking about average pages. That's my first question. My other question is how long it would take a programmer with 1-3 years of experience to install all of these components: a video module, a photo gallery module, and the VirtueMart shopping cart. I pay programmers to do this work, but I have to make as sure as I can that I'm not overpaying them. Thanks in advance for answering these two questions... George
It depends on the complexity and quality of the HTML/CSS design. Usually 1+ hours; if you want additional modules styled (K2, etc.) you need to add extra time, and if the style is different for every page it takes longer still, plus configuration. Basically, the conversion is not that difficult - just replace the main text with content and add some blocks/regions. I would say about 8 hours on average.
As already mentioned, it depends... there's not really anywhere near enough info to give even a rough estimate.
Is Joomla already installed? If it is installed and the desired template is in place, then cutting-and-pasting some page content can take a few minutes if it's just text. If not, allow 2-3 hours for basic installation, including debugging and testing of the standard components like sending email. Then another 1-2 hours for basic installation of components. Testing, debugging and setup of Virtuemart can take a lot longer, depending on what options for shipping and payments you want.
If you're using a good pre-built template that you're 100% happy with then there should be little to do there, but just positioning and discovering which menus work best in which module positions can take a lot of time. Often there is no support in a template for particular components, so further styling for the additions is required. Purchased templates vary wildly in quality; some are just not worth the effort, and sometimes template developers take quick-and-dirty shortcuts to get components to look good in their demo that can take hours to sort out to be useful for anything else.
If you want the Joomla site to look like your old site, or have a custom template built, or radically convert an existing template it can take the length of a piece of string. (One of my clients will easily spend 20 hours endlessly asking me to slightly change spacing, fonts, and colors after signing off on a design and promising that he wouldn't make any more changes. I guess because he can't visualize how things look until he sees the completed site.)
There are plenty of good photo galleries that are bug-free. That shouldn't take long, especially if it comes with a template you already like.
So you may be wondering why all the estimates above vary so much. It just depends on what you've got and what you are really looking for, and what experience the "programmer" has, if it is even in Joomla, or some other CMS, or PHP, or whatever.
Step one in a project like this is to find a programmer you trust not to rip you off; get reliable references if it's someone new. Then get as good an estimate as he or she can give, bearing in mind they might have no experience of how you work or how detailed you've been in laying out the plan. Getting an estimate from someone else is not worth much. Then get progress reports as you go along to see where the hours are going, so you can judge how to proceed cost-effectively.
It is hard to answer your questions; the info you provided is insufficient to make an estimate.
You didn't mention anything about customization, the complexity of the functionality, or the integration of those components. Also, the time frame depends on the programmer's experience and knowledge of Joomla!, not just HTML knowledge.
Usually, installing a component is easy - let's say for the components you mentioned I'd need something like 6-8 hours, but that's just the installation process. From there to a good Joomla! website is a long way. The most time-consuming part is the customization and integration of all the functionality, and that depends on the client's requests.
You mentioned VirtueMart; this can also be a bottleneck. VirtueMart setup depends on the number of shop categories and products, shipping integration, payment methods, and image processing.
Another issue can be template integration; for a good website it's better to have the same look across all the components (VirtueMart, photo gallery, etc.). Is your template a purchased one or a custom development?
But to answer your questions:
1) 5 to 10 pages should take 2-4 hours (text editing, custom typography)
2) 40 to 80 hours for the video module, photo gallery module, and VirtueMart shopping cart
Keep in mind that this is just a rough estimate.
Roland
In my opinion, it should take no more than 2-3 hours with minimal configuration included. The only thing that would need more time than this is the video component. Configuration of these components should make the time vary.
I've read about systems which use the Flickr database of photos to fill in gaps in photos (http://blogs.zdnet.com/emergingtech/?p=629).
How feasible is a system like this? I was toying with the idea (not just a way of killing time but as a good addition to something I am coding) of using Flickr to get photos of a certain entity (in this case, race tracks) and reconstruct a model. My biggest concern is that there aren't enough photos of a particular track and even then, it would be difficult to tell if two photos are of the same part of the racetrack, in which case one of them may be irrelevant.
How feasible is something like this? Is it worth attempting by a sole developer?
Sounds like you're wanting to build a Photosynth style system - check out Blaise Aguera y Arcas' demo at TED back in 2007. There's a section about 4 minutes in where he builds a model of the Sagrada Família from photographs.
I say +1 for the Photosynth answer; it's a great tool. I'm not sure how well you could incorporate it into your own app though.
It's definitely feasible. Anything is possible. And yes, it's doable for a single developer; it just depends on how much free time you have. It would be great to see something like this integrated into Virtual Earth or Google Maps Street View. Someone who could nail software like this could help 3D-model the entire world based purely on photographs. That would be a great product and could make a single developer rich and famous.
So get coding. :)
I have plenty of free time, as I am in between jobs.
One way to do it is to get an overhead view of the track layout, make a blueprint based on this model, and then get one photo of the track and mimic the track's road colour. That would be a start.
LINQ to Flickr on CodePlex has a great API and would be helpful for your task.
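If C#/LINQ isn't a hard requirement, the underlying Flickr REST API is easy to call directly; here is a minimal sketch in Python (the API key is a placeholder you would request from Flickr, and the search text is just an example track):

    import requests

    API_KEY = "YOUR_FLICKR_API_KEY"  # placeholder; request one from Flickr

    params = {
        "method": "flickr.photos.search",
        "api_key": API_KEY,
        "text": "Laguna Seca raceway",  # example query for one race track
        "per_page": 50,
        "format": "json",
        "nojsoncallback": 1,
    }

    resp = requests.get("https://api.flickr.com/services/rest/", params=params, timeout=30)
    resp.raise_for_status()

    for photo in resp.json()["photos"]["photo"]:
        # Build a static image URL from the pieces Flickr returns
        url = (f"https://live.staticflickr.com/{photo['server']}/"
               f"{photo['id']}_{photo['secret']}.jpg")
        print(photo["title"], url)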