Machine Learning - Learning/Language Models
GitHub: https://github.com/mistralai-sf24/hackathon \
X: https://twitter.com/MistralAILabs/status/1771670765521281370

> New release: Mistral 7B v0.2 Base (Raw pretrained model used to train Mistral-7B-Instruct-v0.2)\
> 🔸 https://models.mistralcdn.com/mistral-7b-v0-2/mistral-7B-v0.2.tar \
> 🔸 32k context window\
> 🔸 Rope Theta = 1e6\
> 🔸 No sliding window\
> 🔸 How to fine-tune:
arXiv: https://arxiv.org/abs/2403.13187 \[cs.NE\]\
GitHub: https://github.com/SakanaAI/evolutionary-model-merge
Previous Lemmy.ml post: https://lemmy.ml/post/1015476 \
Original X post (at Nitter): https://nitter.net/xwang_lk/status/1734356472606130646
**Abstract** We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming models and languages (e.g., C++: OpenMP, OpenMP Offload, OpenACC, CUDA, HIP; Fortran: OpenMP, OpenMP Offload, OpenACC; Python: numpy, Numba, pyCUDA, cuPy; and Julia: Threads, CUDA.jl, AMDGPU.jl). We built upon our previous work that is based on the OpenAI Codex, which is a descendant of GPT-3, to generate similar kernels with simple prompts via GitHub Copilot. Our goal is to compare the accuracy of Llama-2 and our original GPT-3 baseline by using a similar metric. Llama-2 has a simplified model that shows competitive or even superior accuracy. We also report on the differences between these foundational large language models as generative AI continues to redefine human-computer interactions. Overall, Copilot generates codes that are more reliable but less optimized, whereas codes generated by Llama-2 are less reliable but more optimized when correct.
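For readers unfamiliar with the kernels named in the abstract, here is a minimal reference sketch of AXPY, GEMV, and GEMM in plain Python. It is not taken from the paper and is not how Llama-2 or Copilot generated their code; the function names, argument order, and list-of-rows matrix layout are assumptions chosen for illustration, and the implementations are deliberately unoptimized.

```python
# Reference (unoptimized) versions of the three kernels named in the abstract.
# Plain Python lists stand in for vectors and matrices (a matrix is a list of rows).
# All names and signatures here are illustrative assumptions, not the paper's code.


def axpy(alpha, x, y):
    """Return alpha * x + y for vectors x and y of equal length."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def gemv(alpha, a, x, beta, y):
    """Return alpha * (A x) + beta * y for an m-by-n matrix A."""
    if len(a) != len(y) or any(len(row) != len(x) for row in a):
        raise ValueError("shape mismatch between A, x, and y")
    return [
        alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
        for row, yi in zip(a, y)
    ]


def gemm(alpha, a, b, beta, c):
    """Return alpha * (A B) + beta * C for A (m-by-k), B (k-by-n), C (m-by-n)."""
    k, n = len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    if len(c) != len(a) or any(len(row) != n for row in c):
        raise ValueError("C must have shape m-by-n")
    columns_of_b = list(zip(*b))  # transpose B once so each column is a tuple
    return [
        [
            alpha * sum(aip * bpj for aip, bpj in zip(row, col)) + beta * cij
            for col, cij in zip(columns_of_b, c_row)
        ]
        for row, c_row in zip(a, c)
    ]
```

The parallel programming models listed in the abstract (OpenMP, CUDA, Numba, and so on) compute the same results; they differ in how the loops are distributed across threads or devices.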
Original (pay-walled): https://www.nytimes.com/2023/09/25/technology/chatgpt-rlhf-human-tutors.html
Original (pay-walled): https://www.wsj.com/tech/ai/meta-is-developing-a-new-more-powerful-ai-system-as-technology-race-escalates-decf9451
Corresponding arXiv preprint: https://arxiv.org/abs/2308.03762
!models@lemmy.intai.tech
Discussion of models, their use, setup, and options.
Please include the models used with your outputs; workflows are optional.
We follow Lemmy’s code of conduct.
Communities
- News and Events
- Ethics, Law, Philosophy
- ML Research
- NLP/Prompting
- Projects #buildinpublic
- Jailbreaks and Security
- OffTopic