XD blog


javascript, programming, python, web server


2013-07-27 Logging click events on your server

Many websites log events, for example where users click, in order to improve the user experience. You might assume that logging page requests is enough, since your server has to provide the content of a page every time a user requests it. However, a cache may serve the page without your server ever seeing the request, a user may click on a link leading outside your website, or the same page may be reached from several different pages. You need more precise information. How do you log a click event, then?

To do that, we first need to do something when a user clicks on a link: we need to catch this event and call another function. We use the following syntax:

<a href="url" onmousedown="sendlog('url')">anchor</a>
The function sendlog is executed when the user clicks on this particular link (more precisely, when the mouse button is pressed). The string between the quotes is the information to log. The function sendlog is defined in another file, defsendlog.js in this case. The following line must be added to the HTML page (head section):
<script type="text/javascript" src="/defsendlog.js"></script>

This file contains the definition of the function. Basically, it consists of calling your server with a url which only has a meaning for you. If you do not want people to understand what you really send to your server, the information should be encrypted in the web page.

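function sendlog(info) 
{
    // the information to log is escaped so that it fits in a url
    var url = 'logs/click/' + encodeURIComponent(info);
    var pageRequest = new XMLHttpRequest();
    // third argument false: synchronous request, the call returns once the server has answered
    pageRequest.open('GET', url, false);
    pageRequest.send(null);
}
It might be useful to add a random value to the url: it is completely meaningless, but it prevents the browser or a proxy from answering with a cached response instead of calling your server.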

If your server is logging every request, you should see a line such as this one:

2013-07-27 21:08:07 - 127.0.0.1 - "GET /logs/click/%2Frss_reader.html HTTP/1.1" 200 -
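Such a line can be read back by a script to recover what was clicked. The following is only a minimal sketch in Python: the regular expression assumes the exact line format shown above (other servers format their logs differently), and the names LOG_LINE and parse_click_line are made up for this illustration.

```python
import re
from urllib.parse import unquote

# Pattern for the line format shown above (an assumption, not a standard format).
LOG_LINE = re.compile(
    r'^(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - (?P<ip>\S+) - '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def parse_click_line(line, prefix="/logs/click/"):
    """Return (date, ip, clicked link), or None if the line is not a click log."""
    match = LOG_LINE.match(line)
    if match is None or not match.group("path").startswith(prefix):
        return None
    encoded = match.group("path")[len(prefix):]
    # unquote reverses what encodeURIComponent did in the browser
    return match.group("date"), match.group("ip"), unquote(encoded)


line = '2013-07-27 21:08:07 - 127.0.0.1 - "GET /logs/click/%2Frss_reader.html HTTP/1.1" 200 -'
print(parse_click_line(line))
# prints ('2013-07-27 21:08:07', '127.0.0.1', '/rss_reader.html')
```
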
In Python, you can easily write your own server by using the class BaseHTTPRequestHandler (see http.server.BaseHTTPRequestHandler, which was BaseHTTPServer.BaseHTTPRequestHandler in Python 2). You just need to override the method do_GET to do something with the request:
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlparse, unquote

class MyHandler(BaseHTTPRequestHandler):
    # ...
    def do_GET(self):
        path = urlparse(self.path)
        if path.path.startswith("/logs/"):
            self.send_response(200)
            self.send_header("Content-type", "text/html")
            self.end_headers()

            # remove the leading "/logs/" (6 characters), then decode the rest
            url = path.path[6:]
            info = unquote(url)
            # do something with info
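To try this out, the decoding step can be isolated in a small function that is easy to check without starting a server. This is a sketch under assumptions, not the server used for this blog: the names extract_log_info, LogHandler and run are hypothetical, the port is arbitrary, and the random value is assumed to be sent as a query parameter called r.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, unquote

LOG_PREFIX = "/logs/"


def extract_log_info(raw_path):
    """Return the decoded information carried by a /logs/ request, else None.

    The query string (for example a random value added to avoid caching)
    is ignored because only the path component is used.
    """
    path = urlparse(raw_path).path
    if not path.startswith(LOG_PREFIX):
        return None
    return unquote(path[len(LOG_PREFIX):])


class LogHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        info = extract_log_info(self.path)
        if info is None:
            self.send_error(404)
            return
        print("logged:", info)  # placeholder: store the event somewhere instead
        self.send_response(200)
        self.send_header("Content-type", "text/plain")
        self.end_headers()


def run(port=8080):
    """Serve until interrupted (the port number is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), LogHandler).serve_forever()


print(extract_log_info("/logs/click/%2Frss_reader.html?r=0.1234"))
# prints click//rss_reader.html
```

Calling run() starts the server; it is left out of the example so that the snippet terminates.
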

Usually, it is convenient to log identified information. We do not need to know who the user is, but we want to be able to group all events coming from the same user over a short period of time (a session). We need to log an id for every user. The following function generates such an id (I got it from here):

function generateUUID()
{
    var d = new Date().getTime();
    var uuid = 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
        var r = (d + Math.random()*16)%16 | 0;
        d = Math.floor(d/16);
        return (c=='x' ? r : (r&0x7|0x8)).toString(16);
    });
    return uuid;
}
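We then use a cookie to store and retrieve this id throughout a session (readCookie and createCookie are helper functions which are not shown here; they are assumed to read and write document.cookie):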
function getUUID()
{
    var uuid = readCookie("rssuuid") ;
    if (uuid == null) {
        var uu = generateUUID() ;
        createCookie("rssuuid", uu) ;
        return uu ;
    }
    else {
        return uuid ;
    }
}
We finally modify the function sendlog to add this id:
function sendlog(link) 
{
    // the url now contains the id of the user followed by the escaped link
    var info = 'logs/click/' + getUUID() + '/' + encodeURIComponent(link);
    var pageRequest = new XMLHttpRequest();
    pageRequest.open('GET', info, false);
    pageRequest.send(null);
}
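On the server side, the id and the link then have to be separated again. Here is one possible way to do it in Python, given as a sketch: the function name split_click_path is made up, and it assumes the url layout /logs/click/<uuid>/<escaped link> produced by the function above.

```python
from urllib.parse import urlparse, unquote


def split_click_path(raw_path, prefix="/logs/click/"):
    """Return (uuid, link) for a path such as /logs/click/<uuid>/<escaped link>."""
    path = urlparse(raw_path).path
    if not path.startswith(prefix):
        return None
    # The uuid contains no '/', and encodeURIComponent escapes '/' in the link,
    # so the first '/' after the prefix separates the two parts.
    uuid, sep, encoded_link = path[len(prefix):].partition("/")
    if not sep:
        return None
    return uuid, unquote(encoded_link)


print(split_click_path(
    "/logs/click/5c469967-1dbe-4214-ea2a-dfa7bb8e1be4/rss_reader.html%3Fsearch%3Dtoday"))
# prints ('5c469967-1dbe-4214-ea2a-dfa7bb8e1be4', 'rss_reader.html?search=today')
```
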
Here is an example of what you could get:
dtime                        uuid                                   type1   type2   args                                                                     id_event
2013-07-28 12:31:05.720984   5c469967-1dbe-4214-ea2a-dfa7bb8e1be4   imp     url     http://localhost:8080/rss_reader.html                                    1
2013-07-28 12:31:19.011383   5c469967-1dbe-4214-ea2a-dfa7bb8e1be4   click   post    354/in                                                                   2
2013-07-28 12:31:20.642292   5c469967-1dbe-4214-ea2a-dfa7bb8e1be4   click   post    354/in                                                                   3
2013-07-28 12:31:21.295469   5c469967-1dbe-4214-ea2a-dfa7bb8e1be4   imp     url     http://localhost:8080/rss_reader.html?blog_selected=&post_selected=354   4
2013-07-28 12:31:32.834610   5c469967-1dbe-4214-ea2a-dfa7bb8e1be4   click   post    354/out                                                                  5
2013-07-28 12:31:39.107667   5c469967-1dbe-4214-ea2a-dfa7bb8e1be4   click   url     rss_reader.html?search=today                                             6
2013-07-28 12:31:39.458983   5c469967-1dbe-4214-ea2a-dfa7bb8e1be4   imp     url     http://localhost:8080/rss_reader.html?search=today                       7
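Once the events are stored with their id, they can be grouped into sessions. The sketch below shows one simple way to do it in Python. It assumes each event is a tuple (dtime, uuid, type1, args) and that a session ends after 30 minutes without any event; both the threshold and the names group_sessions and parse_time are arbitrary choices for this illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def group_sessions(events, max_gap=timedelta(minutes=30)):
    """Group (dtime, uuid, type1, args) events into sessions, one list per session.

    A new session starts when two consecutive events of the same uuid are
    separated by more than max_gap.
    """
    by_user = defaultdict(list)
    for event in sorted(events, key=lambda e: e[0]):
        by_user[event[1]].append(event)

    sessions = []
    for user_events in by_user.values():
        current = [user_events[0]]
        for previous, event in zip(user_events, user_events[1:]):
            if event[0] - previous[0] > max_gap:
                sessions.append(current)
                current = []
            current.append(event)
        sessions.append(current)
    return sessions


def parse_time(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


uuid = "5c469967-1dbe-4214-ea2a-dfa7bb8e1be4"
events = [
    (parse_time("2013-07-28 12:31:05.720984"), uuid, "imp", "http://localhost:8080/rss_reader.html"),
    (parse_time("2013-07-28 12:31:19.011383"), uuid, "click", "354/in"),
    (parse_time("2013-07-28 12:31:32.834610"), uuid, "click", "354/out"),
]
sessions = group_sessions(events)
print(len(sessions), [e[2] for e in sessions[0]])
# prints 1 ['imp', 'click', 'click']
```
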



Xavier Dupré