So, I've been reading Google research papers for decades now; I also worked there for a decade and wrote a few papers of my own.

When Google publishes papers, they tend to juice the significance of the results (Google is not the only group that does this, but they are pretty egregious). You need to be skilled in the field of the paper to be able to pare away the exceptional claims. A really good example is https://spectrum.ieee.org/chip-design-controversy. While I think Google did some interesting work there, and it's true they included some of the results in their chip designs, their comparison claims are definitely over-hyped, and they did not react well when they got called out on it.
Remember Google is a publicly traded company, so everything must be reviewed to "ensure shareholder value". Like dekhn said, it's impressive, but marketing wants more than "impressive".
This is true for both public and private universities; you see the same thing happening in academic papers (and especially in the university PR around the paper).
The actual papers don't overhype. But the university PRs regarding those papers? They can really overhype the results. And of course, the media then takes it up an extra order of magnitude.
Depends on what you call "overhype".

"Wishful mnemonics" in the field were called out by Drew McDermott in the mid-1970s, and they are still a problem today: https://www.inf.ed.ac.uk/teaching/courses/irm/mcdermott.pdf

And:

> As a field, I believe that we tend to suffer from what might be called serial silver bulletism, defined as follows: the tendency to believe in a silver bullet for AI, coupled with the belief that previous beliefs about silver bullets were hopelessly naive.

(H. J. Levesque. On our best behaviour. Artificial Intelligence, 212:27–35, 2014.)
> Google's problem that nobody else has a million cores, wouldn't you agree

On the contrary: it's their advantage. They know it, and they can make outlandish claims that no one will disprove.
It sounds like you're suggesting that we need machines that mass-produce things like automated pipetting machines, and the robots that glue those sorts of machines together.
I've built microscopes intended to be installed inside workcells similar to what companies like Transcriptic built (https://www.transcriptic.com/), so my scope could be automated by the workcell automation components (robot arms, motors, conveyors, etc.).

When I demo'd my scope (which is similar to a 3D printer, using low-cost steppers and other hobbyist-grade components), the CEO gave me feedback which was very educational. They couldn't build a system that used my style of components, because a failure due to a component would bring the whole system down and require an expensive service call (along with expensive downtime for the user). Instead, their mech engineer would select extremely high-quality components with a very low probability of failure, to minimize service calls and other expensive outages. Unfortunately, the cost curve for reliability is not pretty: reducing mechanical failures to close to zero costs close to infinity dollars.

One of the reasons Google's book scanning was so scalable was their choice to build fairly simple, cheap, easy-to-maintain machines, then build a lot of them and train the scanning operators to work with those machines' quirks. Just like with their clusters, they tolerate a much higher failure rate and build all sorts of engineering solutions where other groups would just buy one expensive device with a service contract.
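A back-of-envelope sketch of that trade-off; every number here (unit costs, failure rates, repair costs, the 3-year amortization) is an invented assumption for illustration, not data from either company:

```python
# Back-of-envelope comparison: many cheap, failure-tolerant machines
# vs. one expensive, highly reliable machine. All numbers are invented.

def expected_annual_cost(n_machines, unit_cost, p_fail_per_year, repair_cost):
    """Amortized hardware cost (3-year life) plus expected repair cost."""
    hardware = n_machines * unit_cost / 3
    repairs = n_machines * p_fail_per_year * repair_cost
    return hardware + repairs

# Fleet of cheap machines: failures are frequent, but repairs are cheap
# because staff swap hobbyist-grade parts on site.
cheap = expected_annual_cost(n_machines=20, unit_cost=5_000,
                             p_fail_per_year=0.5, repair_cost=200)

# One premium machine: failures are rare, but each one means an
# expensive vendor service call plus downtime.
premium = expected_annual_cost(n_machines=1, unit_cost=150_000,
                               p_fail_per_year=0.02, repair_cost=20_000)

print(f"cheap fleet:  ${cheap:,.0f}/year")   # ~$35,333/year
print(f"premium unit: ${premium:,.0f}/year") # ~$50,400/year
```

Under these made-up numbers the cheap fleet wins despite 10x the failure rate; the point is that which side wins depends entirely on the repair-cost term, which is what the workcell CEO was optimizing.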
@dekhn It is true. (I also work in the field: I'm a software engineer who got a wet-lab PhD in biochemistry and work at a biotech doing oncology drug discovery.)
Yeah, that's not how anything works. Compounds are approved for use or not based on empirical evidence, thus the need for clinical trials. What's your level of exposure to the pharma industry?
> Compounds are approved for use or not based on empirical evidence, thus the need for clinical trials.

But off-label use is legal, so it's OK to use a drug that's safe but not proven effective (to the FDA's high standards) for that ailment... but only if it's been proven effective for some other, unrelated ailment. That makes no sense.

> What's your level of exposure to the pharma industry?

Just an interested outsider who read, e.g., the Omegaven story on https://www.astralcodexten.com/p/adumbrations-of-aducanumab.
People are facing the existential dread that the knowledge they spent years acquiring is possibly about to become worth a $20 monthly subscription. People will downplay it for years, no matter what.
Alternatively, human brains are just terrible at "high-IQ symbol manipulation", and that's a much easier cognitive task to automate than, say, "surviving as a stray cat".
Similar stuff is being done in materials science, where AI suggests different combinations to find different properties. So when people say AI (machine learning, LLMs) is just for show, I am a bit shocked, as AIs today have accelerated discoveries in many different fields of science, and this is just the start. Anna's Archive will probably play a huge role in this, as no human, or even group of humans, will have all the knowledge of so many fields that an AI will have.

https://www.independent.co.uk/news/science/super-diamond-b26...
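For what "AI suggests different combinations" usually means concretely, here is a minimal sketch of that loop: a surrogate model is trained on measured materials, ranks untested candidates, and the top picks get "measured" and folded back in. Everything here is synthetic; scikit-learn and NumPy are assumed, and the feature/property functions are invented stand-ins:

```python
# Minimal active-learning loop: a surrogate model proposes which
# candidate material compositions to test next. All data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Each candidate is a feature vector (e.g., element fractions, process params).
candidates = rng.random((500, 6))

def measure(X):
    """Pretend wet-lab measurement of the target property (synthetic)."""
    return X[:, 0] * X[:, 1] - 0.5 * X[:, 2] + 0.1 * rng.standard_normal(len(X))

# Start with a small set of already-measured materials.
tested_idx = list(rng.choice(500, size=20, replace=False))
untested_idx = [i for i in range(500) if i not in tested_idx]
y_tested = measure(candidates[tested_idx])

for round_ in range(5):
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(candidates[tested_idx], y_tested)

    # Ask the model which untested candidates look most promising.
    preds = model.predict(candidates[untested_idx])
    chosen = [untested_idx[i] for i in np.argsort(preds)[-5:]]  # top 5

    # "Run the experiment" on the chosen candidates and fold results back in.
    y_new = measure(candidates[chosen])
    tested_idx += chosen
    y_tested = np.concatenate([y_tested, y_new])
    untested_idx = [i for i in untested_idx if i not in chosen]
    print(f"round {round_}: best measured so far = {y_tested.max():.3f}")
```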
I can get value out of them just fine. But I don't use LLMs to find answers, mostly to find questions. It's not really what they're being sold/hyped for, of course. But that's kinda my point.
It's cool, no doubt. But keep in mind this is 20 years late:

https://en.wikipedia.org/wiki/Robot_Scientist
Don't worry: it takes about 10 years for drugs to get approved, so AIs will be superintelligent long before the government gives you permission to buy a dose of an AI-developed drug.
“Drug repurposing for AML” lol

As a person who is literally doing his PhD on AML, implementing molecular subtyping and ex-vivo drug predictions, I find this super random. I would truly suggest our pipeline instead of random drug repurposing :)

https://celvox.co/solutions/seAMLess

edit: Btw, we're looking for ways to fund/commercialize our pipeline. You could contact us through the site if you're interested!
I think "hallucinate" is a good term, because when an AI completely makes up facts or APIs, etc., it doesn't do so as a minor mistake in an otherwise correct reasoning step.
Exactly: they want to automate the most rewarding part, the part we don't need help with... plus I don't believe they've solved the problem of LLMs generating trite ideas.
In science, having ideas is not the limiting factor. They're just automating the wrong thing. I want to have ideas and ask the machine to test them for me, not the other way around.
The difference is the complexity of the ideas. There are straightforward ideas anyone can test and improve, and there are ideas that only PhDs at CERN can test.
This reminds me of a paper: "The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators"

https://arxiv.org/abs/2407.11004

In essence, LLMs are quite good at writing the code to properly parse large amounts of unstructured text, rather than what a lot of people seem to be doing, which is just shoveling data into an LLM's API and asking for transformations back.
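A sketch of that contrast, with a stubbed-out `ask_llm` helper standing in for a real model API; the prompt, the example records, and the generated parser are all invented for illustration, not taken from the paper:

```python
# Contrast: per-record LLM calls vs. one LLM call that writes a parser
# which then runs locally over the whole dataset (the ALCHEmist-style pattern).

def ask_llm(prompt: str) -> str:
    # Stub standing in for a real model API call (OpenAI, Anthropic, local, ...).
    # For this demo it returns the kind of parser an LLM typically produces.
    return (
        "import re\n"
        "def parse(record):\n"
        "    m = re.match(r'Order #(\\d+) shipped (\\S+) to (.+)', record)\n"
        "    return {'order_id': m.group(1), 'ship_date': m.group(2), 'city': m.group(3)}\n"
    )

records = [
    "Order #1234 shipped 2024-05-01 to Berlin",
    "Order #5678 shipped 2024-05-03 to Oslo",
]  # imagine millions of these

# Costly pattern: one API call per record.
# labels = [ask_llm(f"Extract order_id, ship_date, city as JSON: {r}") for r in records]

# Cheap pattern: one API call to write the parser, then run it locally.
namespace: dict = {}
exec(ask_llm("Write a Python parse() for lines like 'Order #N shipped DATE to CITY'"),
     namespace)  # in practice: review/sandbox generated code before exec'ing it
labels = [namespace["parse"](r) for r in records]
print(labels)
```

The API cost of the second pattern is constant in the dataset size, which is where the paper's "500x cheaper" framing comes from.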
> This feels like hubris to me.

No, any scientist has hundreds of ideas they would like to test. It's just part of the job. The hard thing is to do the rigorous testing itself.
The market seems excited to charge in whatever direction the weathervane has last been pointing, regardless of the real outcomes of running in that direction. Hopefully I'm wrong, but it reminds me very much of this study (I'll quote a paraphrase):

> "A groundbreaking new study of over 1,000 scientists at a major U.S. materials science firm reveals a disturbing paradox: When paired with AI systems, top researchers become extraordinarily more productive – and extraordinarily less satisfied with their work. The numbers tell a stark story: AI assistance helped scientists discover 44% more materials and increased patent filings by 39%. But here's the twist: 82% of these same scientists reported feeling less fulfilled in their jobs."

Quote from https://futureofbeinghuman.com/p/is-ai-poised-to-suck-the-so...

Referencing this study: https://aidantr.github.io/files/AI_innovation.pdf
So I'm a biomedical scientist (in training, I suppose... I'm in my 3rd year of a Genetics PhD), and I have seen this trend a couple of times now where AI developers tout that AI will accelerate biomedical discovery through a very specific argument: that AI will be smarter and generate better hypotheses than humans.

For example, in this Google essay they make the claim that CRISPR was a transdisciplinary endeavor, "which combined expertise ranging from microbiology to genetics to molecular biology", and this is the basis of their argument that an AI co-scientist will be better able to integrate multiple fields at once to generate novel and better hypotheses. For one, what they fail to understand as computer scientists (I suspect due to not being intimately familiar with biomedical research) is that microbio/genetics/mol bio are more closely linked than you might expect as a layperson. There is no large leap between microbiology and genetics that would slow down someone like Doudna, or even myself; I use techniques from multiple domains in my daily work. These all fall under the broad domain of what I'll call "cellular/micro biology".

As another example, Dario Amodei of Anthropic wrote something similar in his essay Machines of Loving Grace: that the limiting factor in biomedical research is a lack of "talented, creative researchers", for which AI could fill the gap [1].

The problem with both of these ideas is that they misunderstand the rate-limiting factor in biomedical research, which to them is a lack of good ideas. And this is very much not the case. Biologists have tons of good ideas. The rate-limiting step is testing all these good ideas with sufficient rigor to decide whether to continue exploring a particular hypothesis or to abandon the project for something else. From my own work: the hypothesis driving my thesis I came up with over the course of a month or two. The actual amount of work prescribed by my thesis committee to fully explore whether or not it was correct? About three years' worth. Good ideas are cheap in this field.

Overall I think these views stem from field-specific nuances that don't necessarily translate. I'm not a computer scientist, but I imagine that in computer science the rate-limiting factor is not actually testing hypotheses but generating good ones. It's not like the code you write will take multiple months to run before you get an answer to your question (maybe it will? I'm not educated enough about this to make a hard claim. In biology, it is very common for one experiment to take multiple months before you know the answer to your question, or even whether the experiment failed and you have to do it again). But I'm happy to hear from a CS PhD or researcher about this.

All this being said, I am a big fan of AI. I try to use ChatGPT all the time: I ask it research questions, ask it to search the literature and summarize findings, etc. I even used it literally yesterday to make a deep dive into a somewhat unfamiliar branch of developmental biology easier (and I was very satisfied with the result). But for scientific design, hypothesis generation? At the moment, useless. AI and other LLMs are at this point a very powerful version of Google plus a code writer. And it's not even correct 30% of the time, to boot, so you have to be extremely careful when using it.

I do think that wasting less time exploring hypotheses that are incorrect or bad is a good thing. But the problem here is that we can already pretty easily identify good and bad hypotheses. We don't need AI for that; what takes time, and what slows down research, is the actual testing of these hypotheses. Oh, and politics, which I doubt AI can magic away for us.

[1] https://darioamodei.com/machines-of-loving-grace#1-biology-a...
I recently ran across this toaster-in-dishwasher article [1] again and was disappointed that none of the LLMs I have access to could replicate the "hairdryer-in-aquarium" breakthrough (or the toaster-in-dishwasher scenario, although I haven't explored that one as much), which has made me a bit skeptical of the ability of LLMs to do novel research. Maybe the new OpenAI research AI is smart enough to figure it out?

[1] https://jdstillwater.blogspot.com/2012/05/i-put-toaster-in-d...
> I don't think generating hypotheses is where AI is useful

> Generating hypotheses is the fun, exciting part that I doubt scientists want to outsource to AI

The latter doesn't imply the former.
Just as the invention of writing degraded human memory (before it, people memorized whole stories and poems), with the advent of AI humans will degrade their thinking skills and knowledge in general.
> We applied the AI co-scientist to assist with the prediction of drug repurposing opportunities and, with our partners, validated predictions through computational biology, expert clinician feedback, and in vitro experiments.
> Notably, the AI co-scientist proposed novel repurposing candidates for acute myeloid leukemia (AML). Subsequent experiments validated these proposals, confirming that the suggested drugs inhibit tumor viability at clinically relevant concentrations in multiple AML cell lines.
and,
> For this test, expert researchers instructed the AI co-scientist to explore a topic that had already been subject to novel discovery in their group, but had not yet been revealed in the public domain, namely, to explain how capsid-forming phage-inducible chromosomal islands (cf-PICIs) exist across multiple bacterial species. The AI co-scientist system independently proposed that cf-PICIs interact with diverse phage tails to expand their host range. This in silico discovery, which had been experimentally validated in the original novel laboratory experiments performed prior to use of the AI co-scientist system, are described in co-timed manuscripts (1, 2) with our collaborators at the Fleming Initiative and Imperial College London. This illustrates the value of the AI co-scientist system as an assistive technology, as it was able to leverage decades of research comprising all prior open access literature on this topic.
The model was able to come up with new scientific hypotheses that lab experiments later confirmed to be correct, which is quite significant.