Online dating is an increasingly popular way to meet a potential partner, with 1 in 5 committed relationships now beginning online. Sites such as eHarmony, Tinder and Match.com use a range of techniques to help users find a potential match. Most systems compare location and age, and often pair users with similar preferences. The aim of this project was to build a predictive model that could help improve the pairing mechanism of these websites. By improving match success, these companies can deliver a more tailored service and advertise a better success rate to users, increasing their value and popularity. This was a group project in collaboration with Max Hunt and Will Kerr.
After analysing our test data using machine learning techniques, we found that age difference, enjoyment of going out and sincerity were the most important factors when predicting a successful match from the data set used. The final accuracy, precision and recall of our logistic regression model on the test data were 0.59, 0.59 and 0.51 respectively. In a real-world application, the probability threshold for predicting a ‘match’ would be adjusted to reduce false negatives at the cost of more false positives. The reasoning is that a user would rather experience a bad date than miss the chance to experience a good one. The accuracy achieved is close to the 0.5 mark of random prediction for a binary outcome. This could reflect the idea that choosing a match is an instinctive decision: a model may predict that two people are extremely likely to match, and common interests are a factor, but the ultimate decision is likely to be a human gut feeling, which is inherently difficult to predict.
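The threshold adjustment described above can be illustrated with a minimal sketch. It is not the code used in the project: the labels and predicted probabilities below are made-up illustrative values, and the helper names `confusion_counts` and `metrics` are hypothetical. It only shows how lowering the threshold at which a predicted probability counts as a ‘match’ trades false negatives for false positives.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count (TP, FP, FN, TN), predicting 'match' when prob >= threshold."""
    tp = fp = fn = tn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(y_true, y_prob, threshold):
    """Return (accuracy, precision, recall) at the given threshold."""
    tp, fp, fn, tn = confusion_counts(y_true, y_prob, threshold)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Illustrative values only (assumption): 1 = match, 0 = no match,
# with probabilities such as a logistic regression model might output.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.80, 0.60, 0.45, 0.35, 0.55, 0.40, 0.32, 0.20]

for threshold in (0.5, 0.3):
    tp, fp, fn, tn = confusion_counts(y_true, y_prob, threshold)
    accuracy, precision, recall = metrics(y_true, y_prob, threshold)
    print(f"threshold={threshold}: FP={fp}, FN={fn}, "
          f"accuracy={accuracy:.2f}, precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, lowering the threshold from 0.5 to 0.3 removes both false negatives (recall rises from 0.5 to 1.0) while false positives increase from 1 to 3, which is the trade-off the paragraph above argues a user would prefer.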
Click to see our Report