Now Recruiting
I am now recruiting students willing to undertake a Master's or PhD degree in the fields of Information Retrieval, Data Mining or Machine Learning. Interested students should contact me and provide a current resume outlining their relevant academic and professional history.
How Confident Is Your Multi-Label Classifier? Estimating Expected Accuracy from Label Distributions
When a multi-label classifier makes a prediction (say, flagging a patient record for Diabetes, Hypertension, and COVID-19), how confident should you be? This question is harder than it looks. In a single-label setting, the probability score attached to a prediction is a straightforward measure of confidence. In the multi-label setting, the prediction is a set of labels, and it is less obvious how to turn the probabilities over those labels into a single confidence value.
A new paper by Laurence A. F. Park (Western Sydney University) and Jesse Read (École Polytechnique) takes a rigorous look at this problem, testing seven candidate functions for estimating expected accuracy from a multi-label probability distribution. The paper finds clear winners, which differ depending on how accuracy is measured.
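To make the question concrete, here is a minimal Python sketch of what "expected accuracy" means when the label distribution is known exactly. It is an illustration only and does not reproduce any of the seven candidate functions from the paper. The marginal probabilities, the independence assumption between labels, and all function names are hypothetical choices made for this example.

```python
from itertools import product

# Hypothetical marginal probabilities for the three labels in the example
# (Diabetes, Hypertension, COVID-19). Labels are assumed independent here,
# purely to keep the illustration small.
marginals = [0.9, 0.8, 0.6]


def joint_distribution(marginals):
    """Probability of every possible label set under independence."""
    dist = {}
    for labelset in product([0, 1], repeat=len(marginals)):
        p = 1.0
        for bit, m in zip(labelset, marginals):
            p *= m if bit else 1.0 - m
        dist[labelset] = p
    return dist


def exact_match(y, y_hat):
    """1 if every label agrees, otherwise 0."""
    return 1.0 if y == y_hat else 0.0


def hamming_accuracy(y, y_hat):
    """Fraction of labels that agree."""
    return sum(a == b for a, b in zip(y, y_hat)) / len(y)


def expected_accuracy(dist, y_hat, metric):
    """Average the metric over all true label sets, weighted by probability."""
    return sum(p * metric(y, y_hat) for y, p in dist.items())


dist = joint_distribution(marginals)
y_hat = max(dist, key=dist.get)  # predict the most probable label set

print("prediction:", y_hat)
print("expected exact match:", expected_accuracy(dist, y_hat, exact_match))
print("expected Hamming accuracy:", expected_accuracy(dist, y_hat, hamming_accuracy))
```

Even in this toy case the same prediction has a different expected accuracy under each measure, which is why a single confidence value is harder to define than in the single-label setting.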
Data Science Research Group
If you have a new idea that will advance the field of data science, why not present it at one of our Data Science research group seminars? If you are interested, contact me to get the process started.
textIR 0.5 Released
The latest version of textIR has been released. The new features include indexing, retrieval and topic modelling of text document sets. More information can be found at https://www.scm.uws.edu.au/~lapark/textIR.