[-] Deliverator@kbin.social 1 points 1 year ago

I really like kbin's layout/structure; it's like a mix of reddit and twitter, and with some optimization it could really be something special.

[-] Deliverator@kbin.social 2 points 1 year ago

That old George Carlin bit is more relevant than ever:
https://www.youtube.com/watch?v=FZq6MfGKpQ0

[-] Deliverator@kbin.social 1 points 1 year ago* (last edited 1 year ago)

This is a good example: it's a hydrophone recording of a glass sphere imploding. The level of sound and echo should give you a good idea of the kind of forces we're dealing with:

https://www.youtube.com/watch?v=1_qlQhBa5V4


Modern Bayesian inference involves a mixture of computational techniques for estimating, validating, and drawing conclusions from probabilistic models as part of principled workflows for data analysis. Typical problems in Bayesian workflows are the approximation of intractable posterior distributions for diverse model types and the comparison of competing models of the same process in terms of their complexity and predictive performance. This manuscript introduces the Python library BayesFlow for simulation-based training of established neural network architectures for amortized data compression and inference. Amortized Bayesian inference, as implemented in BayesFlow, enables users to train custom neural networks on model simulations and re-use these networks for any subsequent application of the models. Since the trained networks can perform inference almost instantaneously, the upfront neural network training is quickly amortized.
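To make the amortization idea concrete, here is a minimal sketch of the principle, not BayesFlow's actual API: the toy Gaussian simulator, the prior, and the use of a plain MLP point estimator (instead of the invertible networks BayesFlow uses for full posteriors) are all placeholder assumptions of mine.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def simulate(n_sims=20000, n_obs=10):
    # Hypothetical toy model: infer the mean of a Gaussian from n_obs observations.
    theta = rng.normal(0.0, 1.0, size=n_sims)              # draw parameters from the prior
    x = rng.normal(theta[:, None], 1.0, (n_sims, n_obs))   # simulate data given parameters
    return x, theta

x_train, theta_train = simulate()

# One upfront training run "amortizes" over all future datasets.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300)
net.fit(x_train, theta_train)

# Inference on any new dataset is now a single, near-instant forward pass.
x_new = rng.normal(0.7, 1.0, size=(1, 10))
print(net.predict(x_new))  # point estimate of theta; a full posterior needs richer networks
```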

[-] Deliverator@kbin.social 16 points 1 year ago

Frankly, I think we need more people before we can start getting concerned about things like that. If we're trying to make the Fediverse a viable alternative, it has to be appealing and easy enough to use that people want to use it. If we don't get that right, this whole thing is doomed from the start.

[-] Deliverator@kbin.social 0 points 1 year ago

I vote the left one or some variant of it

[-] Deliverator@kbin.social 11 points 1 year ago

It's that good old American puritanical spirit at work.

[-] Deliverator@kbin.social 12 points 1 year ago

"Give Me Convenience or Give Me Death"

[-] Deliverator@kbin.social 7 points 1 year ago

The live-page pseudo-app is working well for me on GrapheneOS with the Vanadium browser. I still get a random 503 error now and then, but it's already so much better than a couple days ago.


Stable Diffusion revolutionised image creation from descriptive text. GPT-2, GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of language tasks. ChatGPT introduced such language models to the general public. It is now clear that large language models (LLMs) are here to stay, and will bring about drastic change in the whole ecosystem of online text and images. In this paper we consider what the future might hold. What will happen to GPT-{n} once LLMs contribute much of the language found online? We find that use of model-generated content in training causes irreversible defects in the resulting models, where tails of the original content distribution disappear. We refer to this effect as Model Collapse and show that it can occur in Variational Autoencoders, Gaussian Mixture Models and LLMs. We build theoretical intuition behind the phenomenon and portray its ubiquity amongst all learned generative models. We demonstrate that it has to be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web. Indeed, data collected from genuine human interactions with systems will become increasingly valuable in the presence of content generated by LLMs in data crawled from the Internet.
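As a minimal illustration of the effect described here (my own toy example, not code from the paper): repeatedly fitting a single Gaussian to samples drawn from the previous generation's fit makes the estimated spread drift downward, so the distribution's tails thin out and the fit eventually collapses toward a point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "real" data drawn from a standard normal.
data = rng.normal(0.0, 1.0, size=100)

for gen in range(51):
    mu, sigma = data.mean(), data.std()        # fit a Gaussian to the current data
    if gen % 10 == 0:
        print(f"gen {gen:2d}: mu={mu:+.3f}, sigma={sigma:.3f}")
    data = rng.normal(mu, sigma, size=100)     # next generation sees only model output
```

With a sample of size n, the expected fitted variance shrinks by a factor of (n-1)/n each generation on top of random drift, which is the simplest case of the tail-loss the abstract describes.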

[-] Deliverator@kbin.social 20 points 1 year ago* (last edited 1 year ago)

It would help a lot if telecom/internet infrastructure were treated like our other infrastructure. Not to mention the literal billions of dollars in fraud that companies like Verizon and Comcast get away with. I still get mad when I think about how they were given massive sums of money to expand fiber-optic infrastructure and gave themselves bonuses instead.

[-] Deliverator@kbin.social 0 points 1 year ago

I haven't been on reddit in a few days, but before I left I recall seeing someone post a github link that touted something like that for lemmy instances. I haven't found anything for kbin, but I know there are tools out there for scraping data/images/etc. from different subreddits and users.

[-] Deliverator@kbin.social 0 points 1 year ago

One thing that helped me get a handle on things was creating my own magazine and just playing around with it. Magazines = subreddits, and articles = threads. An article is meant to be more text based, and you can also select the option to upload pictures/links to the magazine instead. I haven't messed with the whole microblog section but if you select "add post" it will upload to the microblog of the selected magazine instead of the thread list. Boosts are different from likes in that they boost your personal 'reputation' instead of ranking thread popularity.

Overall I really like the level of customization kbin offers, definitely needs some UI improvements but it kinda feels like a hyperpowered version of reddit once you get used to it.


MOOCs

Nowadays, there are many really excellent online lectures to get you started; the list is too long to include them all. Every one of the major MOOC sites offers not just one but several good Machine Learning classes, so please check Coursera, edX, and Udacity yourself to see which ones interest you.

However, a few stand out, either because they're very popular or because they're taught by people famous for their work in ML. Roughly in order from easiest to hardest, they are:

Books

The most often recommended textbooks on general Machine Learning are (in no particular order):

Note that these books delve deep into math and might be a bit heavy for complete beginners. If you don't care so much about derivations or exactly how the methods work, but would rather just apply them, then the following are good practical intros:

There are, of course, a whole plethora of books that only cover specific subjects, as well as many books about surrounding fields in math. A very good list has been collected by /u/ilsunil here.

Deep Learning Resources

Math Resources

Programming Languages and Software

In general, the most used languages in ML are probably Python, R, and Matlab (with the latter losing more and more ground to the former two). Which one suits you better depends wholly on your personal taste. For R, a lot of functionality is either already in the standard library or can be found through various packages on CRAN. For Python, NumPy/SciPy are a must. From there, scikit-learn covers a broad range of ML methods; a minimal example follows below.
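As a quick taste of the scikit-learn workflow (the built-in dataset and the choice of classifier here are just illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load a built-in toy dataset and split it into train/test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit a classifier and evaluate it; most estimators share this fit/predict API.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(accuracy_score(y_test, clf.predict(X_test)))
```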

If you just want to play around a bit and don't do much programming yourself, then things like Visions of Chaos, WEKA, KNIME, or RapidMiner might be to your liking. A word of caution: a lot of people in this subreddit are very critical of WEKA, so even though it's listed here, it is probably not a good tool for anything more than playing around a bit. A more detailed discussion can be found here.

Deep Learning Software, GPU's and Examples

There are a number of modern deep learning toolkits you can use to implement your models; below you will find some of the more popular ones. This is by no means an exhaustive list. Generally speaking, you should use whatever GPU has the most memory, highest clock speed, and most CUDA cores available to you; at the time of writing, that was the NVIDIA Titan X from the previous generation. These frameworks are all very close in computation speed, so you should choose the one you prefer in terms of syntax.

Theano is a Python-based deep learning toolkit developed by the Montreal Institute for Learning Algorithms (MILA), a cutting-edge deep learning academic research center and home of many users of this forum. It has a large number of tutorials, ranging from beginner-level to cutting-edge research.
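For flavor, here is a minimal Theano sketch of its define-then-compile style; the logistic-regression setup is just an illustrative choice of mine, not from any particular tutorial:

```python
import numpy as np
import theano
import theano.tensor as T

# Symbolic inputs and a shared (stateful) weight vector.
X = T.dmatrix('X')
y = T.dvector('y')
w = theano.shared(np.zeros(3), name='w')

# Logistic regression: prediction, loss, and gradient are all symbolic expressions.
p = T.nnet.sigmoid(T.dot(X, w))
loss = T.nnet.binary_crossentropy(p, y).mean()
grad = T.grad(loss, w)

# Compiling yields a callable that also updates w in place on every call.
train = theano.function([X, y], loss, updates=[(w, w - 0.1 * grad)])
```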

Torch is a LuaJIT-based scientific computing framework developed by Facebook Artificial Intelligence Research (FAIR) and is also in use at Twitter Cortex. There is also the Torch blog, which contains examples of the framework in action.

TensorFlow is a Python deep learning framework developed by Google Brain and in use at Google Brain and DeepMind. It is the newest framework around. Some TensorFlow examples may be found here. Do not ask questions on the Google Groups; ask them on Stack Overflow.
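A minimal sketch of TensorFlow's graph-and-session programming model (the 1.x-era API current when this was written; shapes and values are arbitrary):

```python
import tensorflow as tf

# Build a symbolic graph: y = xW + b. Nothing is computed yet.
x = tf.placeholder(tf.float32, shape=[None, 3])
W = tf.Variable(tf.zeros([3, 1]))
b = tf.Variable(tf.zeros([1]))
y = tf.matmul(x, W) + b

# The graph only executes inside a session.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))
```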

Neon is a Python-based deep learning framework built around Maxas, a custom and highly performant CUDA compiler, by NervanaSys.

Caffe is an easy-to-use, beginner-friendly deep learning framework. It provides many pretrained models and is built around a protobuf format for specifying neural networks.

Keras can be used to wrap Theano or TensorFlow for ease of use; see the sketch below.
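A minimal sketch of what that looks like (layer sizes are arbitrary; Keras dispatches to whichever backend, Theano or TensorFlow, is configured):

```python
from keras.models import Sequential
from keras.layers import Dense

# A small feed-forward classifier; the backend handles the actual computation.
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# Call model.fit(X_train, y_train) once you have data.
```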

Datasets and Challenges for Beginners

There are a lot of good datasets here to try out your new Machine Learning skills.

Research Oriented Datasets

In many papers, you will find a few datasets are the most common. Below, you can find the links to some of them.

Communities

ML Research

Machine Learning is a very active field of research. The two most prominent conferences are without a doubt NIPS and ICML. Both sites contain the PDF versions of the papers accepted there; they're a great way to catch up on the most up-to-date research in the field. Other very good conferences include UAI (general AI), COLT (which covers theoretical aspects), and AISTATS.

Good journals for ML papers are the Journal of Machine Learning Research and the Machine Learning journal; many preprints also appear on arXiv.

Other sites and Tutorials

FAQ

  • How much Math/Stats should I know?

That depends on how deep you want to go. For a first exposure (e.g. Ng's Coursera class) you won't need much math, but in order to understand how the methods really work, having at least an undergrad-level grounding in statistics, linear algebra, and optimization won't hurt.

