How do you prevent gaming of page views? - algorithm

Say I have a site with pages. Pages are ranked based on the number of times they have been viewed. It is good for a page to be highly ranked because it will make it show up higher in my search results. Hence, the author of a page may try to game the system to increase that particular page's views.
So how do you prevent that while still keeping a quasi-accurate count?
I have come up with the following "scheme":
A user can only affect the page view once per session. This is what I would normally expect. If a user returns to the site later and views the page again, it should count as another page view.
The problem is that this makes the page view increment vulnerable to a script that clears its cookies before each request. The easiest solution to this problem would be to save the IP address and only allow the same IP address to increment the page count once. This, however, has several major drawbacks. First, it would potentially take up a lot of storage. Second, it would prevent users on big LANs (who share one IP address) from incrementing the page count. Lastly, a user could not revisit a page and increment the page view more than once from the same IP. I can live with that, but would rather live without it.
The best method I can come up with off the top of my head would be to save the last X IP addresses and not let anyone from these IP addresses affect the page view count. This would effectively stop any (simple) script from raising the page view count. Furthermore, it would probably be a good idea to add a delay to the display of the actual view count (basically keeping two counts and a datetime field for when the "display" count was last updated with the "actual" count, something I believe is done on the SE sites).
This is not a perfect solution, so I would be happy to hear your suggestions and/or comments.
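The scheme described above (a bounded list of recent IP addresses plus a delayed "display" count) could be sketched roughly as follows. This is only an illustration: the class name, the option names, and the default limits are assumptions, and one instance is meant to track a single page in memory rather than in a database.

```javascript
// Hypothetical sketch: names, limits, and delays are illustrative assumptions.
class PageViewCounter {
  constructor({ recentIpLimit = 100, displayDelayMs = 60 * 60 * 1000 } = {}) {
    this.recentIpLimit = recentIpLimit;   // the "X" from the question
    this.displayDelayMs = displayDelayMs; // how often the public count is refreshed
    this.recentIps = [];                  // oldest entry first
    this.actualCount = 0;
    this.displayCount = 0;
    this.displayUpdatedAt = 0;            // timestamp of the last refresh
  }

  // Returns true if the view was counted, false if the IP was seen recently.
  recordView(ip) {
    if (this.recentIps.includes(ip)) return false;
    this.recentIps.push(ip);
    // Forget the oldest IP once more than X are stored.
    if (this.recentIps.length > this.recentIpLimit) this.recentIps.shift();
    this.actualCount += 1;
    return true;
  }

  // The public count only catches up with the real count after the delay.
  getDisplayCount(now = Date.now()) {
    if (now - this.displayUpdatedAt >= this.displayDelayMs) {
      this.displayCount = this.actualCount;
      this.displayUpdatedAt = now;
    }
    return this.displayCount;
  }
}
```

Because old addresses fall out of the list, a returning visitor can be counted again later, which keeps the storage bounded at the cost of letting a script that rotates through more than X addresses slip by.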

Don't prevent: monitor and handle.
I would use a very different approach. Let the page views stay the same, but have reporting in place that looks for view-gaming. If a page gets gamed, you can find out who is responsible, give them a warning and a page-view penalty. If it continues, ban them.
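One possible shape for such a report, as a rough sketch: flag pages where a single IP address accounts for an unusually large share of the views. The function name, the log format ({ pageId, ip } entries), and the thresholds are all assumptions made for illustration, and real gaming may need richer signals than this.

```javascript
// Hypothetical sketch: viewLog is assumed to be an array of { pageId, ip } entries.
function findSuspiciousPages(viewLog, { minViews = 50, maxShare = 0.3 } = {}) {
  const perPage = new Map(); // pageId -> Map(ip -> view count)
  for (const { pageId, ip } of viewLog) {
    if (!perPage.has(pageId)) perPage.set(pageId, new Map());
    const perIp = perPage.get(pageId);
    perIp.set(ip, (perIp.get(ip) || 0) + 1);
  }

  const report = [];
  for (const [pageId, perIp] of perPage) {
    let total = 0;
    let topIp = null;
    let topCount = 0;
    for (const [ip, count] of perIp) {
      total += count;
      if (count > topCount) {
        topIp = ip;
        topCount = count;
      }
    }
    const share = topCount / total;
    // Flag pages with enough traffic where one IP dominates the views.
    if (total >= minViews && share > maxShare) {
      report.push({ pageId, topIp, share, total });
    }
  }
  return report;
}
```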

I think that you should consider the reported characteristics of the browser as well. Browser fingerprinting has been done before and is well publicized. You can then figure out some pretty advanced heuristics on determining whether the same user is trying to game you. But don't publicize that you're using browser fingerprinting of course. Also, it won't stop incognito mode, but I'm just trying to give you one more avenue of thought to follow, in addition to your current IP oriented strategies.
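As a very rough sketch of the idea, a server could build a key from a few request headers and count views per key. The choice of headers, the function names, and the assumption that "headers" is a plain object with lower-cased names are all illustrative; real fingerprinting uses many more signals than this.

```javascript
// Assumption: `headers` is a plain object keyed by lower-cased header names.
function fingerprintKey(headers) {
  const fields = ['user-agent', 'accept', 'accept-language', 'accept-encoding'];
  // Cookies are deliberately ignored, so clearing them does not change the key.
  return JSON.stringify(fields.map((name) => (headers[name] || '').trim()));
}

// Counts how many requests share each fingerprint key.
function countViewsByFingerprint(requests) {
  const counts = new Map();
  for (const { headers } of requests) {
    const key = fingerprintKey(headers);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}
```

Many legitimate visitors share identical headers, so a key like this is better used as one input to a heuristic (for example, combined with the IP address) than as an identity on its own.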

Related

best practices with single page websites to decrease page load time without hurting SEO

I'm wondering what some best practices are for decreasing the page load time of single page websites in a way that won't hurt SEO.
I'm leaning toward an ajax solution with "hijax linking", but I'm wondering what the best practices are in terms of the load order for a page. For instance, say I have a simple webpage with home, about, pictures of my cat, contact, etc., and I'm planning to have it all show up on the homepage via vertical scrolling, allotting one "screen" worth of content per item.
I'm coding this in WordPress, so my main idea would be to first load the first "screen", i.e. the hero section of the homepage, as part of home.php, so the user doesn't have to wait for the whole thing (and for SEO). Then, once that has finished loading, I'd load the next four via ajax in the background. So I'm wondering what the best strategy might be to go about that. Someone provided this answer elsewhere:
"Build a standard 5 page site using php with proper separation of header, footer, content. Then use javascript to redirect to a single (separate) page with all content include()ed on the page."
In WordPress I'd take this to mean: create a separate page with a loop that grabs the other four "screens" as posts, and then load this page after home.php has loaded. Does anyone see any issues with this approach or, as the question asks, know of any better or best practices to accomplish this? I'd appreciate them. Thanks.
There are several things you can do:
Improve the performance of your back-end code, if there is any.
Pagination: split the page into smaller pages.
Caching.
Decrease the size of the content: shrink background images and minify the JS.
Compress the content.
Most of the time, the right optimization will depend on your situation. Starting with any one of the above is a reasonable first step.
Your question is tagged with "wordpress", so I am assuming that you use WordPress.
If so, I think a logical starting point is to use one of the WordPress caching plugins. I use Quick Cache for my website, and it makes a significant difference.
But you shouldn't stop with the plugin. Consider the quality of the theme you are using. Poorly designed themes may make inefficient database calls and slow down your website.
Delaying and loading part of the page with ajax shouldn't be your first optimization. Try all the other options first.

detect if site is being accessed via iframe? embed widget with shopping cart

I have a shopping cart that I want to embed in a widget/iframe on other users' sites. I see three ways of doing this, each with drawbacks. Here are the options, from estimated most to least work.
Option 1: Recreate the interactive shopping cart UI in a JavaScript widget and pass the values to a server script with AJAX, so the variables reach the main site. When the user clicks "checkout", they are redirected to the main shopping cart site with the variables populated from what they entered in the widget.
pros: complete experience
cons: most work to complete, creating the UI and the AJAX requests.
Option 2: Somehow detect if the user is coming to the shopping cart via an iframe. If so, run alternate code that opens a new window when the user clicks "checkout", redirecting the user to a secure page and getting the variables from the cart via AJAX to populate the final checkout.
pros: mid amount of work; must do an AJAX request to get variables from the shopping cart to populate the final checkout
cons: can we easily detect if the site is being accessed by a user within an iframe on another site?
Option 3: Complete the entire checkout process inside the iframe/widget.
pros: least amount of work, just embed the cart in an iframe
cons: will not show https in the browser, so the user may be reluctant to purchase
What is the best option?
If you could provide a bit more information, maybe I could offer you an even better option. For starters, what have you built this application with (languages/framework)? Also, would you say your application's functionality is similar to Shopify's in that you allow users to host e-commerce sites through your service? If not, tell us a bit more about your application.
Here's a quick response to the options you provided.
option 1: the only real option as I see it. Whether you're embedding the shopping cart in specifically an iframe or rendering it onto the user's page as part of a template, you should be navigating the customer away to your main site to complete the checkout process. Or at least give them a lot of screen real-estate to work with (a sizable modal for example).
option 2: is messy. You can tell if a request is coming from a remote form (like an iframe) by appending url parameters. But taking the approach you're suggesting with this doesn't make too much sense.
option 3: too heavy unless you take a modal-approach like what I mentioned in response to option 1.
That being said, if you are building an application like Shopify, you should be able to build a template for each user's website that has a section dedicated to displaying a shopping cart pertaining to the current customer's session. No iframes or widgets necessary with this approach. But again, it all depends on the use cases of your application.
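The URL-parameter check mentioned under option 2 might look like the sketch below. It assumes the embed code you hand out points the iframe at a URL carrying a marker such as "embedded=1"; that parameter name, the function name, and the example host are made up for illustration. Anyone can add or strip the parameter, so treat it as a hint rather than proof.

```javascript
// Assumption: embedded pages are requested as e.g. https://shop.example.com/cart?embedded=1
function isEmbeddedRequest(urlString) {
  const url = new URL(urlString);
  // Only an explicit "embedded=1" marks the request as coming from the widget.
  return url.searchParams.get('embedded') === '1';
}
```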
If your only concern with Option 2 is detecting whether your content is being loaded within an iframe, you can do that in JavaScript by checking "window.top === window.self". ("top.frames.length == 0" also works, but only if your own page contains no frames of its own.)
For example, you could show or hide different conditional form content, or a different submit button, using the following:
    var embedded = document.getElementById('embedded-content');
    var unembedded = document.getElementById('unembedded-content');
    if (window.top === window.self) {
        // Not embedded in an iframe: show the unembedded content.
        embedded.style.display = "none";
        unembedded.style.display = "block";
    } else {
        // Embedded in an iframe: show the embedded content.
        embedded.style.display = "block";
        unembedded.style.display = "none";
    }
As you've stated, the first option is the best in terms of user experience and the most likely to achieve the highest possible conversions. How much better the conversion is compared to the next best solution cannot be objectively measured, as it involves recurring customers, your own brand name, the kind of products, etc. Since the conversion rates will directly affect you (and your company), it's wise to make an estimate first to see if your efforts spent will be worth it in the short and long term.
The second option is the sweet middle ground; you still get brand recognition and customers will have some security reassurance (via address bar); (i)frame detection is easily done by a simple JavaScript comparison: top === window. However, you're losing the continuity and hence likely lose some conversion. If this risk is manageable, I'd go for this option in the short term.
Not being able to see the security certificate directly via the green lock makes the third option the least desirable. However, not all is lost; by clever use of imagery you can still gain some trust with your end-user, as outlined in this image, which is part of a great article from Smashing Magazine.
Your decision should be based on:
what can be done in the short term
what should be done in the long term
how important secure visual cues are to my potential customers
time / money spent on either solution versus revenues (break-even analysis)

Adsense revenue depending on use of Ajax

I noticed that a website like imgur.com displays ads on each page of the website.
This means each time you press "next" to view another funny picture, AdSense refreshes.
But on a website where you scroll to view more pages (such as 9gag.com),
Ajax handles the loading of more funny pictures, so it's not allowed to refresh AdSense when a user scrolls for more funny pictures.
Does this mean 50 users staying on 9gag.com for 3 hours, scrolling and viewing 300 funny pictures, would help 9gag.com generate revenue equal to ONE imgur.com user that views only 1 picture?
Does this also mean I should stay away from Ajax if I wanted revenue?
This was very confusing for me, please help me understand AdSense better.
Thank you!
Well, the problem with fully scripted, Ajax-loaded content is that AdSense cannot read it. It therefore has a hard time displaying relevant ads, because most advertisers have chosen to target the visitor's location and the keywords on the pages. So if AdSense has no text, then most of the time it's not going to be able to serve an ad.
But I looked at 9gag.com and they are using what I think is the ajax version of Adsense, or perhaps the premium version of Adsense which allows for all sorts of things and is quite different from the core Adsense program in many ways that nobody seems to know about, and few are invited. All the big publishers I suppose.
Anyway, if you do end up clicking on one of the posts on 9gag.com, you'll see other ads. Granted, the way imgur.com has things set up should encourage more content viewing per visitor and thus also some more ad viewing, but I wouldn't say that one necessarily has more traffic overall than the other. There are too many unknown factors to determine that. It's not something you can do just by looking at a site. That is where having good analytics of your traffic and visitor behavior comes in.

Event tracking and virtual pageviews are not tracked in Google Analytics

I'm trying to track how visitors interact with the price calculator that I placed on my website.
I've tried placing events and virtual pageviews in different places and on different handlers (onClick, onMouseDown, onMouseOver, in the href attribute, onChange in the input tag). No matter what I do, no events or virtual pageviews are tracked. I can see the __utm.gif requests for everything I want to track in Firebug, but nothing shows up in the GA reports.
Here's the calculator I'm tracking (it's in Russian; the event I'm trying to track is the big orange button).
Firstly, I do see a _trackPageview() call passing "/virtual/trees/calculate" on various onmouseover, onchange, and onclick handlers, and at face value I see no reason you shouldn't be seeing "/virtual/trees/calculate" show up in your pages report, but Google officially states that it can take up to 24 hours to see data.
Second, I do not see any event tracking on your page. I do not see any code for it, nor do I see any GA calls showing it from random interactions on your page. If it is there, you will need to give detail about where it is and how it is coded.
Third, do you see the page view for the actual page? Which account/profile are you looking at? Because when I first load the page, I see two separate hits to GA happening, the first to account/profile # "UA-25026876-1" (which is from your on-page code) and the second to account/profile # "UA-20200270-1" (which is happening from a counter.js script include), and the second one is where your virtual page views are going to.
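For reference, an event call with the classic ga.js asynchronous syntax would look roughly like the sketch below. It assumes the page uses the async snippet that reads commands from the "_gaq" queue (the page may instead use the older synchronous tracker object), and the helper name and the category/action values are made up for illustration.

```javascript
// Assumption: the async ga.js snippet is on the page and reads commands from _gaq.
var _gaq = _gaq || [];

// Hypothetical helper: the category and action values are illustrative.
function trackCalculatorClick(label) {
  // _trackEvent(category, action, opt_label)
  _gaq.push(['_trackEvent', 'Calculator', 'Calculate', label]);
  // Virtual pageview, using the path mentioned above.
  _gaq.push(['_trackPageview', '/virtual/trees/calculate']);
}
```

Commands pushed without a tracker name go to the default tracker, so it is worth confirming which account that tracker was set up with before looking for the data in the reports.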

UI - How I can make users effectively read what my program says?

I have a simple form that searches through the 2000+ issues of a 3rd party webcomic. (Easy, it's like xkcd: http://url/number)
The form is as simple as possible. It goes like this:
What number do you want?
The user writes a number, clicks OK, and is taken to the 3rd party website in a new tab.
Then my form asks a question: "Did you find that issue memorable? Enter the name here, and we will add it to the "best issues" on the home page"
When the user writes the name of the issue, it is added to the database (pending moderation by me).
So I assumed this design was the easiest and most convenient one users could find.
Unfortunately, almost NONE of the users (maybe 2% behaved correctly) actually read what I asked. Some of the issues are offline and give a 404. For those issues, users write a completely wrong title in the textbox, and correctly capitalized!
It's as if I named http://xkcd.com/627/ "The Great Adventures of Jack Smith".
The users are from all over the country, with different browsers and different cookies.
I cannot believe that my users do not read what I ask. It is a WHITE PAGE with a button that disappears when clicked and a textbox... what could be easier than that?
Maybe I should add a checkbox saying "I acknowledge that this form is for submitting memorable issues, not for fun"? Oh, who will read that?
Or maybe I could enable the textbox only if the user has actually clicked the link?
Do your users understand your site/service?
I, for one, don't remember (web-)comics by their issue number, but by their content. When asked what xkcd comic number I would like to see, I'd probably input random numbers like 42, 123 or 666 or something.
After you make me guess for a number you ask me if the associated comic is particularly epic, then you ask me to do some data entry for it to put it on some kind of hall of fame. Honestly I do not understand what the logic is behind inserting titles for non existing comics -- are you sure they don't actually land them on the comic page for "The Great Adventures of Jack Smith"? The 2% of your userbase probably noticed the issue in the URL you generated for them, addressed it and typed in the right title. Or, maybe, they are typing the name of the comic they actually wanted to see instead.
There's a simple way to know. Have your mom use it and do not correct her if she makes mistakes. All mistakes she makes are your fault, not hers.
Without having the text of the labels you have put it's harder for us to second guess what's going wrong than it is for you.
Try it!!
You could try parsing the title of the page and obtaining the title yourself
OR you might want to request the username/handle.
Once the user enters the details and clicks SUBMIT, show a confirmation page (a preview of how the submission will be listed). Make sure to include the username/handle as the person who submitted it (this brings a sense of responsibility to the person who submits). Remember to keep a back button so the user can go back, make the necessary changes, and submit again.
Allow users to create profiles on your site (they may be as simple as Stack Overflow's profile system; here's mine for example). Unless the user is logged in, submissions are posted as anonymous. The rest is the same as above.
NOTE: There is a slim possibility that you are being targeted by spam/captcha bots, hence the random text entries. Still, do implement the above. A better UI never hurt anyone, right?
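The first suggestion (obtaining the title yourself) could be sketched as below, assuming your server has already fetched the comic page's HTML as a string. The function name is made up, and the regular expression is a simplification: it does not decode HTML entities, and a proper HTML parser would be more robust.

```javascript
// Hypothetical helper: pulls the text of the first <title> element out of an HTML string.
function extractTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  if (!match) return null;
  // Collapse runs of whitespace and trim the ends.
  const title = match[1].replace(/\s+/g, ' ').trim();
  return title === '' ? null : title;
}
```

A page that returns a 404 will usually have no useful title, so a null (or an obvious error-page title) is also a cheap signal that there is nothing for the user to name.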

Resources