AIhub.org

deep dive

Optimizing LLM test-time compute involves solving a meta-RL problem

  20 Jan 2025
By altering the LLM training objective, we can reuse existing data along with more test-time compute to train models to do better.

Generating a biomedical knowledge graph question answering dataset

  17 Jan 2025
Introducing PrimeKGQA - a scalable approach to dataset generation, harnessing the power of large language models.

Human-AI collaboration in physical tasks

  03 Jan 2025
Creating an intelligent assistant that uses the sensors in a smartwatch to support physical tasks such as cooking and DIY.

Improving calibration by relating focal loss, temperature scaling, and properness

  28 Nov 2024
In ML classification tasks, achieving high accuracy is only part of the goal; it's equally important for models to express how confident they are in their predictions.

AI in cancer research & care: perspectives of three KU Leuven institutes

This story is a collaboration of three Institutes that are working at the intersection of cell research, cancer research and care, and artificial intelligence.

Identification of hazardous areas for priority landmine clearance: AI for humanitarian mine action

  19 Nov 2024
In close collaboration with the UN and local NGOs, we co-develop an interpretable predictive tool to identify hazardous clusters of landmines.

Enhancing controlled query evaluation through epistemic policies

The winners of an IJCAI2024 best paper award explain the key advances of their work.

VQAScore: Evaluating and improving vision-language generative models

  06 Nov 2024
We introduce a new evaluation metric and benchmark dataset for automated evaluation of text-to-visual generative models.

No free lunch in LLM watermarking: Trade-offs in watermarking design choices

  23 Oct 2024
Common design choices in LLM watermarking schemes make the resulting systems surprisingly susceptible to watermark removal or spoofing attacks.

Linguistic bias in ChatGPT: Language models reinforce dialect discrimination

  30 Sep 2024
Examining how ChatGPT’s behavior changes in response to text in different varieties of English.

Rethinking LLM memorization

  23 Sep 2024
Our approach provides a simple and practical perspective on what memorization can mean, providing a useful tool for functional and legal analysis of LLMs.

How to evaluate jailbreak methods: a case study with the StrongREJECT benchmark

  09 Sep 2024
Providing a more accurate assessment of jailbreak effectiveness.

Proportional aggregation of preferences for sequential decision making

  27 Aug 2024
Read about work that won an outstanding paper award at AAAI 2024.

CMU-MATH team’s innovative approach secures 2nd place at the AIMO prize

  19 Aug 2024
Dive into our blog to discover the winning formula.

Are we ready for multi-image reasoning? Launching VHs: The Visual Haystacks benchmark!

  01 Aug 2024
This project focuses on the “Multi-Image Question Answering” (MIQA) task.

How to regularize your regression

  17 Jun 2024
Considering how to tune the norm-based regularization parameters in linear regression.

Beyond the mud: Datasets, benchmarks, and methods for computer vision in off-road racing

  17 Apr 2024
Off-road motorcycle racing poses unique challenges that push the boundaries of what existing computer vision systems can handle.

Modeling extremely large images with xT

  08 Apr 2024
Introducing a new framework to model large images on contemporary GPUs while aggregating global context with local details.

The shift from models to compound AI systems

  15 Mar 2024
State-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.

Unlocking the potential of entity-centric knowledge graphs: transforming healthcare and beyond

The concept of entity-centric knowledge graphs holds promise in reshaping how we organize, access, and leverage data.

On noisy evaluation in federated hyperparameter tuning

  12 Jan 2024
Our work explores key sources of noise and shows that even small amounts of noise can have a significant impact on tuning methods.

Asymmetric certified robustness via feature-convex neural networks

  14 Dec 2023
We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios.

Goal representations for instruction following

  23 Nov 2023
How can we reconcile the ease of specifying tasks through natural language-based approaches with the performance improvements of goal-conditioned learning?

A comprehensive survey on rare event prediction

We review the rare event prediction literature and highlight open research questions and future directions in the field.

Test-time adaptation with slot-centric models

  25 Sep 2023
Improving out-of-distribution scene decomposition accuracy.

Training diffusion models with reinforcement learning

  15 Sep 2023
We show how diffusion models can be trained on downstream objectives directly using reinforcement learning.

A ‘black box’ AI system has been influencing criminal justice decisions for over two decades – it’s time to open it up

  11 Aug 2023
Melissa Hamilton and Pamela Ugwudike investigate the use of automated decision-making systems in courts and prisons.

On the stepwise nature of self-supervised learning

  01 Aug 2023
Presenting a mathematical picture of the training process of large-scale SSL methods.

Navigating to objects in the real world

  24 Jul 2023
Research shows that modular learning is a reliable approach to navigate to objects.

Generating 3D molecular conformers via equivariant coarse-graining and aggregated attention

  14 Jul 2023
Introducing a variational encoder for molecular conformer generation.

GPT-4 + Stable-Diffusion = ?: Enhancing prompt understanding of text-to-image diffusion models with large language models

  26 Jun 2023
Our LLM-grounded model delivers improved prompt understanding in cases including negation, numeracy, and spatial relationships.

On privacy and personalization in federated learning: a retrospective on the US/UK PETs challenge

  05 Jun 2023
Studying the use of differential privacy in personalized, cross-silo federated learning.

TIDEE: An embodied agent that tidies up novel rooms using commonsense priors

  28 Apr 2023
We introduce a new benchmark to test agents in their ability to clean up messy scenes without any human instruction.

Koala: A dialogue model for academic research

  18 Apr 2023
In this post, we introduce Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web.

Are model explanations useful in practice? Rethinking how to support human-ML interactions

  14 Apr 2023
This post describes a workflow for evaluating XAI methods, how this workflow was instantiated in two domains, and insights from these efforts.

Methods for addressing class imbalance in deep learning-based natural language processing

  30 Mar 2023
This blog post gives an overview of class imbalance in NLP and surveys methods for addressing it.






©2024 - Association for the Understanding of Artificial Intelligence

©2021 - ROBOTS Association