Arxiv Papers - [short] HiRE: High Recall Approximate Top-k Estimation for Efficient LLM Inference