In July, Lockheed Martin completed the build of NASA’s X-59 test aircraft, which is designed to turn sonic booms into mere thumps, in the hope of making overland supersonic flight a possibility. Ground tests and a first test flight are planned for later in the year. NASA aims to have enough data to hand over to US regulators in 2027.

1686

374

The German Rhineland-Palatinate State Parliament has ditched X (Twitter) in favour of open-source decentralised Mastodon (gadgeteer.co.za)

submitted 1 year ago by cyu@sh.itjust.works to c/technology@lemmy.ml

12 comments fedilink

1687

Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/technology@lemmy.ml

2 comments fedilink

cross-posted from: https://lemmy.world/post/3879861

Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B

Hello everyone! This post marks an exciting moment for !fosai@lemmy.world and everyone in the open-source large language model and AI community.

We appear to have a new contender on the block, a model apparently capable of surpassing OpenAI's state-of-the-art GPT-4 in coding evals (evaluations).

This is huge. Not too long ago I made an offhand comment on us catching up to GPT-4 within a year. I did not expect that prediction to end up being reality in half the time. Let's hope this isn't a one-off scenario and that we see a new wave of open-source models that begin to challenge OpenAI.

Buckle up, it's going to get interesting!

Here are some notes from the blog, which you should visit and read in its entirety:

https://www.phind.com/blog/code-llama-beats-gpt4

Blog Post

We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset; the resulting models achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67% according to OpenAI's official technical report in March. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.

The CodeLlama models released yesterday demonstrate impressive performance on HumanEval.

CodeLlama-34B achieved 48.8% pass@1 on HumanEval

CodeLlama-34B-Python achieved 53.7% pass@1 on HumanEval

We have fine-tuned both models on a proprietary dataset of ~80k high-quality programming problems and solutions. Instead of code completion examples, this dataset features instruction-answer pairs, setting it apart structurally from HumanEval. We trained the Phind models over two epochs, for a total of ~160k examples. LoRA was not used — both models underwent a native fine-tuning. We employed DeepSpeed ZeRO 3 and Flash Attention 2 to train these models in three hours using 32 A100-80GB GPUs, with a sequence length of 4096 tokens.

Furthermore, we applied OpenAI's decontamination methodology to our dataset to ensure valid results, and found no contaminated examples.

The methodology is:

For each evaluation example, we randomly sampled three substrings of 50 characters, or used the entire example if it was shorter than 50 characters.

A match was identified if any sampled substring was a substring of the processed training example.

For further insights on the decontamination methodology, please refer to Appendix C of OpenAI's technical report. Presented below are the pass@1 scores we achieved with our fine-tuned models:

Phind-CodeLlama-34B-v1 achieved 67.6% pass@1 on HumanEval

Phind-CodeLlama-34B-Python-v1 achieved 69.5% pass@1 on HumanEval
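For readers unfamiliar with the metric, here is a small sketch of how pass@k is commonly computed for HumanEval-style benchmarks, using the unbiased estimator 1 - C(n-c, k) / C(n, k) from the paper that introduced HumanEval. The post does not say how many samples per problem were drawn, so the function names and the example inputs below are purely illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: samples generated for the problem
    c: samples that passed all unit tests
    k: budget of attempts being scored
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to c / n per problem, so a pass@1 score is simply the average fraction of generated solutions that pass the tests.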

Download

We are releasing both models on Hugging Face for verifiability and to bolster the open-source community. We welcome independent verification of results.

https://huggingface.co/Phind/Phind-CodeLlama-34B-v1

https://huggingface.co/Phind/Phind-CodeLlama-34B-Python-v1

If you get a chance to try either of these models out, let us know how it goes in the comments below!

If you found anything about this post interesting, consider subscribing to !fosai@lemmy.world.

Cheers to the power of open-source! May we continue the fight for optimization, efficiency, and performance.

1688

-11

@technology (mastodon.social)

submitted 1 year ago by Narayoni@mastodon.social to c/technology@lemmy.ml

0 comments fedilink

@technology
The EU has just clamped down on big tech. Britain, take note https://mastodon.scot/@DrHannahGraham/110956248305740892

1689

Dropbox Axes Unlimited Cloud Storage for Businesses (blog.dropbox.com)

submitted 1 year ago by BrikoX@lemmy.zip to c/technology@lemmy.ml

24 comments fedilink

1690

280

Windows feature that resets system clocks based on random data is wreaking havoc (arstechnica.com)

submitted 1 year ago by const_void@lemmy.ml to c/technology@lemmy.ml

28 comments fedilink

1691

Huawei reportedly building 'secret' semiconductor fabs (www.theregister.com)

submitted 1 year ago by BrikoX@lemmy.zip to c/technology@lemmy.ml

5 comments fedilink

1692

-3

Pyramid Schemes Are Illegal. MLMs Are Not. What About the Tech That Powers Them? (themarkup.org)

submitted 1 year ago by 111@zerobytes.monster to c/technology@lemmy.ml

2 comments fedilink

1693

95% of senior adults report robot helpful in reducing loneliness (www.youtube.com)

submitted 1 year ago by cyu@sh.itjust.works to c/technology@lemmy.ml

17 comments fedilink

1694

158

In a historic about-face, Apple publicly supports right-to-repair bill (grist.org)

submitted 1 year ago by 111@zerobytes.monster to c/technology@lemmy.ml

40 comments fedilink

1695

Cyberspatial Feudalism: Understanding Twitter's Rebrand, Worldcoin, and the Rest of Tech (theluddite.org)

submitted 1 year ago by theluddite@lemmy.ml to c/technology@lemmy.ml

1 comment fedilink

1696

245

Hosting firm says it lost all customer data after ransomware attack (www.bleepingcomputer.com)

submitted 1 year ago by floofloof@lemmy.ca to c/technology@lemmy.ml

40 comments fedilink

1697

772

The Internet Is About to Get a Lot Worse (US focused) (buttondown.email)

submitted 1 year ago by thenexusofprivacy@lemmy.sdf.org to c/technology@lemmy.ml

260 comments fedilink

Charlie Jane Anders discusses KOSA (the Kids Online Safety Act).

If you're in the US, https://www.stopkosa.com/ makes it easy to contact your Senators and ask them to oppose KOSA.

"A new bill called the Kids Online Safety Act, or KOSA, is sailing towards passage in the Senate with bipartisan support. Among other things, this bill would give the attorney general of every state, including red states, the right to sue Internet platforms if they allow any content that is deemed harmful to minors. This clause is so vaguely defined that attorneys general can absolutely claim that queer content violates it — and they don't even need to win these lawsuits in order to prevail. They might not even need to file a lawsuit, in fact. The mere threat of an expensive, grueling legal battle will be enough to make almost every Internet platform begin to scrub anything related to queer people.

The right wing Heritage Foundation has already stated publicly that the GOP will use this provision to remove any discussions of trans or queer lives from the Internet. They're salivating over the prospect.

And yep, I did say this bill has bipartisan support. Many Democrats have already signed on as co-sponsors. And President Joe Biden has urged lawmakers to pass this bill in the strongest possible terms."

1698

Brain-reading progress on re-enabling speech after paralysis (mastodon.ie)

submitted 1 year ago by cyu@sh.itjust.works to c/technology@lemmy.ml

0 comments fedilink

1699

Google's Dysfunctional AR Division Plans Apple Vision Pro Clone With Samsung (tech.slashdot.org)

submitted 1 year ago by csfirecracker@lemmyf.uk to c/technology@lemmy.ml

19 comments fedilink

1700

132

API Misuse: Hacker Leaks 2.6M Duolingo Users' Emails & Names (www.hackread.com)

submitted 1 year ago by 111@zerobytes.monster to c/technology@lemmy.ml

11 comments fedilink

Technology

This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.

Ask in DM before posting product reviews or ads. Otherwise, all such posts are subject to removal.

Rules:

1: All Lemmy rules apply

2: Do not post low effort posts

3: NEVER post naziped*gore stuff

4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

7: crypto related posts, unless essential, are disallowed

founded 5 years ago

MODERATORS

MinutePhrase@lemmy.ml