Google DeepMind researchers have introduced ATLAS, a set of scaling laws for multilingual language models that formalize how model size, training data volume, and language mixtures interact as the ...
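The item is truncated before the laws themselves; for orientation, the standard parametric scaling law that work in this vein typically builds on (the Chinchilla form of Hoffmann et al., 2022) relates loss to parameter count N and training tokens D, and a multilingual variant might discount D by a language's mixture weight p_i. The p_i-weighted form below is an illustrative assumption, not ATLAS's published result:

```latex
% Standard compute scaling law (Hoffmann et al., 2022),
% followed by a hypothetical per-language variant in which
% language i receives mixture weight p_i of the token budget
% (illustrative assumption, not ATLAS's published form):
\[
  L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
  \qquad
  L_i(N, D, p_i) = E_i + \frac{A_i}{N^{\alpha_i}} + \frac{B_i}{(p_i D)^{\beta_i}}
\]
```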
Additionally, client models trained on edge devices can be merged into a global model on the server, preserving data privacy. Results: Natural Language Processing (NLP) technologies underpinning ...
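The snippet stops before the method details; as a minimal sketch of the merging step it describes, assuming plain FedAvg-style weight averaging (the function name and data layout are illustrative, not from the paper):

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Merge edge-trained client models into a global model by
    dataset-size-weighted averaging (FedAvg-style). Only weights
    leave the device, so raw training data stays private.

    client_weights: list of dicts mapping layer name -> np.ndarray
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    return {
        name: sum((n / total) * w[name]
                  for w, n in zip(client_weights, client_sizes))
        for name in client_weights[0]
    }

# Two clients with different amounts of local data.
clients = [
    {"dense": np.array([1.0, 2.0])},
    {"dense": np.array([3.0, 4.0])},
]
print(federated_average(clients, client_sizes=[100, 300]))
# {'dense': array([2.5, 3.5])} -- the larger client dominates the merge
```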
This column focuses on open-weight models from China, Liquid Foundation Models, performant lean models, and a Titan from ...
TeleChat3 series – China Telecom’s TeleAI released the first large-scale Mixture-of-Experts (MoE) models trained entirely on ...
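The item names the Mixture-of-Experts architecture without describing it; the sketch below shows generic top-k MoE routing, where a learned gate activates only a few experts per token so compute stays sparse. Everything here (shapes, top_k=2, the gating matrix) is a textbook illustration and says nothing about TeleChat3's actual configuration:

```python
import numpy as np

def moe_layer(x, experts, gate_w, top_k=2):
    """Generic top-k MoE routing: a gate scores all experts per token,
    but only the k highest-scoring experts are evaluated.
    x: (d,) token activation; experts: list of (d, d) weight matrices;
    gate_w: (d, n_experts) gating matrix. Illustrative only.
    """
    logits = x @ gate_w                    # one gate score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the k best experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                   # softmax over selected experts
    return sum(p * (x @ experts[i]) for p, i in zip(probs, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
out = moe_layer(rng.normal(size=d),
                [rng.normal(size=(d, d)) for _ in range(n_experts)],
                rng.normal(size=(d, n_experts)))
print(out.shape)  # (8,) -- same shape as a dense layer, at ~k/n the compute
```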
The proposed Coordinate-Aware Feature Excitation (CAFE) module and Position-Aware Upsampling (Pos-Up) module both adhere to ...
The rise of AI has given us an entirely new vocabulary. Here's a list of the top AI terms you need to learn, in alphabetical order.
Abstract: Accurate and efficient detection of internal defects in chips is crucial for ensuring the reliability and yield of electronic products. However, conventional object detection models often ...
This repository has been consolidated into model-runner. All future development, issues, and pull requests should be directed there. Please visit the new repository for the latest updates and to ...
Abstract: We propose Matryoshka, a novel framework for transformer model pruning, enabling dynamic runtime control while maintaining accuracy competitive with modern large language models (LLMs).
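The abstract doesn't spell out the mechanism; one common way to get dynamic runtime control in this vein is Matryoshka-style nesting, where the leading slice of each weight matrix forms a valid smaller model. The class below is a hedged sketch of that general idea, not the paper's implementation:

```python
import numpy as np

class SliceableLinear:
    """A linear layer whose leading output units form a valid
    sub-layer, so one set of trained weights can serve many model
    widths at runtime (Matryoshka-style nesting; illustrative only).
    """
    def __init__(self, d_in, d_out, rng):
        self.w = rng.normal(size=(d_in, d_out)) / np.sqrt(d_in)

    def forward(self, x, width_frac=1.0):
        # Prune at inference time by keeping only the first
        # fraction of output units -- no retraining, no copy.
        d_keep = max(1, int(self.w.shape[1] * width_frac))
        return x @ self.w[:, :d_keep]

rng = np.random.default_rng(0)
layer = SliceableLinear(16, 64, rng)
x = rng.normal(size=16)
print(layer.forward(x).shape)                   # (64,) full model
print(layer.forward(x, width_frac=0.25).shape)  # (16,) pruned at runtime
```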