University: University of St Andrews
Title: Improving anonymity in web browsing
Large amounts of information are exchanged through the Web every day, and security mechanisms such as encryption protect sensitive data. However, far less attention is paid to protecting users’ browsing patterns, which are not considered as sensitive as credit card data. Browsing patterns can be used for behavioural fingerprinting, which can compromise users’ privacy. This report assesses techniques that can be used to protect browsing behaviour data on the client side. A novel Bayesian Network model is presented which serves a dual purpose: it can de-anonymise users based on past experience, but it can also generate new, credible browsing histories. The model's generative capabilities are then used to create several data spoofing algorithms that inject noise sampled from the model. The model also serves as a baseline for evaluating the spoofing algorithms.
Overall, four different spoofing algorithms are presented and compared. This report concludes that some of the discussed spoofing algorithms are effective against browsing behaviour-based de-anonymisation attacks.
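The idea of injecting noise sampled from a generative model can be illustrated with a minimal sketch. It is not one of the four algorithms from the report: a simple site-frequency model stands in for the Bayesian Network, and the names `fit_site_model`, `inject_noise` and `noise_ratio` are hypothetical.

```python
import random
from collections import Counter


def fit_site_model(histories):
    """Count site frequencies across all histories.

    Assumption: this frequency table is only a stand-in for the
    report's Bayesian Network, which is not specified here.
    """
    counts = Counter(site for history in histories for site in history)
    sites = sorted(counts)
    weights = [counts[site] for site in sites]
    return sites, weights


def inject_noise(session, model, noise_ratio=0.5, rng=None):
    """Return a copy of `session` with decoy visits mixed in.

    Decoys are sampled from the model in proportion to site frequency
    and inserted at random positions, so the real visits keep their
    relative order.
    """
    rng = rng or random.Random()
    sites, weights = model
    n_decoys = int(len(session) * noise_ratio)
    decoys = rng.choices(sites, weights=weights, k=n_decoys)
    spoofed = list(session)
    for decoy in decoys:
        spoofed.insert(rng.randrange(len(spoofed) + 1), decoy)
    return spoofed


# Example with made-up site names.
histories = [
    ["news.example", "mail.example"],
    ["video.example", "news.example", "shop.example"],
]
model = fit_site_model(histories)
session = ["mail.example", "news.example", "mail.example", "shop.example"]
spoofed = inject_noise(session, model, noise_ratio=0.5, rng=random.Random(0))
```

With a noise ratio of 0.5, the spoofed session contains the four real visits plus two decoys. A higher ratio hides more of the real pattern at the cost of more extra traffic.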
The main hypothesis of this project was as follows. If a user is browsing from a shared device or a shared network, their browsing behaviour (the websites the user has visited and the timings of those visits) is sufficient to identify the user, regardless of the device or network they are browsing from. It may therefore be possible to link different browsing sessions to one user even if one session took place at home and the other at a cafe or on any other shared network, where the IP address and other network-related metadata do not make it obvious which user is browsing. If a browsing session can be uniquely attributed to a user, or even if the set of users it could belong to can be narrowed down, web privacy is adversely affected.
The model created for this project proved successful at de-anonymising users, given only a list of URLs they have visited. The model performed well on both seen data (training data) and unseen data (new browsing histories that were not used for training). Moreover, some of the examined privacy-protecting algorithms showed promising results in protecting users’ privacy against history-based attacks.
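Attributing a session to a user from its URLs alone can be sketched with a naive Bayes classifier, arguably the simplest Bayesian network. This is an illustrative assumption, not the model built in the project: it ignores visit timings, assumes a uniform prior over users, and the names `train`, `log_likelihood` and `attribute` are hypothetical.

```python
import math
from collections import Counter


def train(histories_by_user):
    """Build per-user site counts from labelled browsing histories."""
    vocab = {site for history in histories_by_user.values() for site in history}
    counts = {user: Counter(history) for user, history in histories_by_user.items()}
    return counts, vocab


def log_likelihood(session, site_counts, vocab_size, alpha=1.0):
    """Log P(session | user), treating visits as independent.

    Laplace smoothing (alpha) keeps unseen sites from producing zero
    probability; one extra slot is reserved for out-of-vocabulary sites.
    """
    total = sum(site_counts.values())
    denominator = total + alpha * (vocab_size + 1)
    return sum(math.log((site_counts[site] + alpha) / denominator) for site in session)


def attribute(session, model):
    """Return the most likely user and the per-user scores.

    Assumes a uniform prior over users, so the prior term is omitted.
    """
    counts, vocab = model
    scores = {
        user: log_likelihood(session, site_counts, len(vocab))
        for user, site_counts in counts.items()
    }
    return max(scores, key=scores.get), scores


# Example with made-up users and site names.
model = train({
    "alice": ["news.example", "mail.example", "news.example", "code.example"],
    "bob": ["video.example", "shop.example", "video.example", "mail.example"],
})
user, scores = attribute(["news.example", "code.example"], model)
```

In this toy example the session is attributed to `alice`, because both visited sites appear in her history and in none of `bob`'s. A spoofing algorithm would aim to shrink the gap between the per-user scores.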
Daria completed school in Russia and moved to the UK in 2015 to pursue a degree in Computer Science at the University of St Andrews. She received a Bachelor of Science degree in Computer Science (first-class honours) and was awarded the prestigious Principal’s Medal. This award recognises students who display exceptional endeavour and achievement during their time at St Andrews, and only two were awarded by the University in 2019. During her studies, Daria also received the Principal’s Scholarship, awarded yearly to fifty final-year students from across the University's departments whose grades to date are the highest in their faculties. Since joining the University, she has consistently been placed on the Dean’s List, which recognises academic performance. In addition to the university awards, she has participated in various programming competitions and hackathons, and her team won prizes at two university hackathons.
Daria has successfully completed two Software Engineering internships with Google and is currently doing a third over the summer. All three internships involved working in different teams and allowed her to work with different technologies. During the first internship, which included both front-end and back-end work, she implemented a new feature in the Google Trips mobile application. For her second internship project she developed a software component for a tool that does automated user interface testing. The improved implementation of the tool replaced a third-party library with a new implementation, making the code cleaner and more manageable. Currently she is interning as a Site Reliability Engineer.