AIhub.org

deep dive


Identification of hazardous areas for priority landmine clearance: AI for humanitarian mine action

In close collaboration with the UN and local NGOs, we co-develop an interpretable predictive tool to identify hazardous clusters of landmines.
19 November 2024

Enhancing controlled query evaluation through epistemic policies

The winners of an IJCAI2024 best paper award explain the key advances of their work.

VQAScore: Evaluating and improving vision-language generative models

We introduce a new evaluation metric and benchmark dataset for automated evaluation of text-to-visual generative models.
06 November 2024

No free lunch in LLM watermarking: Trade-offs in watermarking design choices

Common design choices in LLM watermarking schemes make the resulting systems surprisingly susceptible to watermark removal or spoofing attacks.
23 October 2024

Linguistic bias in ChatGPT: Language models reinforce dialect discrimination

Examining how ChatGPT’s behavior changes in response to text in different varieties of English.
30 September 2024

Rethinking LLM memorization

Our approach provides a simple and practical perspective on what memorization can mean, providing a useful tool for functional and legal analysis of LLMs.
23 September 2024



How to evaluate jailbreak methods: a case study with the StrongREJECT benchmark

Providing a more accurate assessment of jailbreak effectiveness.
09 September 2024

Proportional aggregation of preferences for sequential decision making

Read about work that won an outstanding paper award at AAAI 2024.
27 August 2024

CMU-MATH team’s innovative approach secures 2nd place at the AIMO prize

Dive into our blog to discover the winning formula.
19 August 2024

Are we ready for multi-image reasoning? Launching VHs: The Visual Haystacks benchmark!

This project focuses on the “Multi-Image Question Answering” (MIQA) task.
01 August 2024

How to regularize your regression

Considering how to tune the norm-based regularization parameters in linear regression.
17 June 2024

Beyond the mud: Datasets, benchmarks, and methods for computer vision in off-road racing

Off-road motorcycle racing poses unique challenges that push the boundaries of what existing computer vision systems can handle.
17 April 2024

Modeling extremely large images with xT

Introducing a new framework to model large images on contemporary GPUs while aggregating global context with local details.
08 April 2024

The shift from models to compound AI systems

State-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.
15 March 2024

Unlocking the potential of entity-centric knowledge graphs: transforming healthcare and beyond

The concept of entity-centric knowledge graphs holds promise in reshaping how we organize, access, and leverage data.
27 February 2024

On noisy evaluation in federated hyperparameter tuning

Our work explores key sources of noise and shows that even small amounts of noise can have a significant impact on tuning methods.
12 January 2024

Asymmetric certified robustness via feature-convex neural networks

We propose the asymmetric certified robustness problem, which requires certified robustness for only one class and reflects real-world adversarial scenarios.
14 December 2023

Goal representations for instruction following

How can we reconcile the ease of specifying tasks through natural language-based approaches with the performance improvements of goal-conditioned learning?
23 November 2023

A comprehensive survey on rare event prediction

We review the rare event prediction literature and highlight open research questions and future directions in the field.

Test-time adaptation with slot-centric models

Improving out-of-distribution scene decomposition accuracy.
25 September 2023

Training diffusion models with reinforcement learning

We show how diffusion models can be trained on downstream objectives directly using reinforcement learning.
15 September 2023

A ‘black box’ AI system has been influencing criminal justice decisions for over two decades – it’s time to open it up

Melissa Hamilton and Pamela Ugwudike investigate the use of automated decision-making systems in courts and prisons.
11 August 2023

On the stepwise nature of self-supervised learning

Presenting a mathematical picture of the training process of large-scale SSL methods.
01 August 2023

Navigating to objects in the real world

Research shows that modular learning is a reliable approach to navigate to objects.
24 July 2023

Generating 3D molecular conformers via equivariant coarse-graining and aggregated attention

Introducing a variational encoder for molecular conformer generation.
14 July 2023

GPT-4 + Stable-Diffusion = ?: Enhancing prompt understanding of text-to-image diffusion models with large language models

Our LLM-grounded model delivers improved prompt understanding in cases including negation, numeracy, and spatial relationships.
26 June 2023

On privacy and personalization in federated learning: a retrospective on the US/UK PETs challenge

Studying the use of differential privacy in personalized, cross-silo federated learning.
05 June 2023

TIDEE: An embodied agent that tidies up novel rooms using commonsense priors

We introduce a new benchmark to test agents in their ability to clean up messy scenes without any human instruction.
28 April 2023

Koala: A dialogue model for academic research

In this post, we introduce Koala, a chatbot trained by fine-tuning Meta’s LLaMA on dialogue data gathered from the web.
18 April 2023

Are model explanations useful in practice? Rethinking how to support human-ML interactions

This post describes a workflow for evaluating XAI methods, how this workflow was instantiated in two domains, and insights from these efforts.
14 April 2023












©2024 - Association for the Understanding of Artificial Intelligence


 











