I mean, you can pretty simply just engineer around that. Dumping 5k pages in at once is obviously an idiotic way of approaching the issue. Instead, have an LLM go through 500 words at a time, with 125 words of overlap between consecutive chunks, to pull out key words, phrases, and intentions, and put that into a structured data format like JSON. Then parse the JSON to pick up on regions where specific sets of phrases and words occur. Give those sections, in part or entirely, to the LLM again, and again have it give you structured output. Further parse and repeat. Do all of these steps several times to get a probability distribution over each assumption about what is being said or intended. Build the results into a Bayes net, or whatever you like, to get at the most likely summaries of what the document is saying. These results can then be manually reviewed. If you're picky, you can even adjust the sensitivity to pick up on much more nuanced reads of the text.
Like, if the limit of your imagination is throwing spaghetti against a wall, obviously your results are going to turn out like shit. But with a bit of hand-holding, some structure, and engineering, LLMs can be made to substantially outperform their (average) human counterparts. They do already. Use them in a more probabilistic way to create distributions around the assumptions they make, and you can set up a system which will vastly outperform what an individual human can do.
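To make the 500-words-with-125-overlap step concrete, here is a minimal Python sketch. The function name `chunk_words` and splitting on whitespace are my own assumptions for illustration; a real pipeline would probably count tokens rather than words.

```python
def chunk_words(text, size=500, overlap=125):
    """Split text into windows of `size` words, each sharing `overlap`
    words with the previous window. Names and defaults are illustrative."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()          # naive whitespace split (assumption)
    step = size - overlap         # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break                 # this window already reached the end
    return chunks
```

Each chunk would then go to the model with a prompt asking for JSON output. With the defaults, a window advances 375 words at a time, so a phrase that straddles a chunk boundary still shows up whole in at least one window.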
(just asked up the thread:)
GPT-4 & Claude 3 Opus have made little summarization oopsies for me this past week. You’d trust ‘em in such a high profile case?
What if you ran them 100 times over the same text?
This is court, not a school project or academia. But in general I agree with you.
No, this is discovery, and we're discussing how you would engineer a system to support automating it.
LLMs are still pretty limited, but I would agree with you that if there is a single task at which they can excel, it's translating and summarizing. They also have much bigger context windows than 500 words. I think ChatGPT has a 32k token context, which is certainly enough to summarize entire chapters at a time.
You'd definitely need to review the result by hand, but AI could suggest certain key things to look for.
People were doing this somewhat effectively with garbage Markov chains, and it was 'ok'. There is research going on right now doing precisely what I described. I know because I wrote a demo for the researcher whose team wanted to do this, and we're not even using fine-tuned LLMs. You can overcome many of the issues around 'hallucinations' by just repeating the same call several times to get to a probability. There are teams funded in the hundreds of millions to build the engineering around these things. Wrap the calls in enough engineering, get the bumper rails into place, and the current generation of LLMs is completely capable of what I described.
This current AI revolution is just getting started. We're in the 'Deep Blue' phase, where people are shocked that an AI can even do the thing as well as or better than humans. We'll be at AlphaGo in a few years, and we simply won't recognize the world we live in. In a decade, it will be the AI that is the authority, and people will be questioning whether humans should be allowed to do certain things.
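A rough Python sketch of the "repeat the same call several times to get to a probability" idea. Everything here is hypothetical: `ask` stands in for whatever function calls your model and returns one label per call, and the 0.8 threshold is an arbitrary number I picked.

```python
import itertools
from collections import Counter

def label_distribution(ask, chunk, runs=20):
    """Call `ask(chunk)` `runs` times and return the share of each answer.
    Assumes the model is sampled with some randomness (temperature > 0)."""
    counts = Counter(ask(chunk) for _ in range(runs))
    return {label: n / runs for label, n in counts.most_common()}

def needs_review(dist, threshold=0.8):
    """Flag a chunk for manual review when no answer reaches `threshold`."""
    return max(dist.values()) < threshold

# Stand-in for a real model call: answers "threat" 3 times out of 4.
_fake_answers = itertools.cycle(["threat", "threat", "threat", "joke"])

def fake_ask(chunk):
    return next(_fake_answers)

dist = label_distribution(fake_ask, "some 500-word chunk", runs=20)
print(dist)                # {'threat': 0.75, 'joke': 0.25}
print(needs_review(dist))  # True
```

Agreement across runs tells you how consistent the model is, not that it is right, so the low-agreement chunks are the ones to hand to a human reviewer first.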
Read a little further. I might disagree with you about the overall capability/potential of AI, but I agree this is a great task to highlight its strengths.
Sure, and yes, I think we largely agree. On the differences, I've seen that they can effectively be overcome by making the same call repeatedly and looking at the distribution of results. It's probably not as good as just having a better underlying model, but even then the same approach might be necessary.