[Tips] Google Bard can now watch videos, give a summary, and answer questions about the video, including giving you the recipe.
Google Bard recently gained the ability to watch YouTube videos and then answer questions about them. I asked it to watch a video from Joshua Weissman, a maker who doesn’t put his recipes directly in the description (though he links to them): specifically, the Popeyes Chicken Sandwich But Better video. I then asked Bard for the recipe, and it gave me both the ingredients and the steps! I double-checked it and it was perfect, including the optional mushroom powder.
I then dropped in the URL of a recipe with the ingredients given by volume and asked it to convert them to grams, and finally gave it just the plain text of a recipe and asked it to do the same thing. It did okay on both, with the errors coming from the websites it crawled for the conversions.
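For anyone curious what that volume-to-grams step involves, here is a minimal Python sketch. It is not how Bard does it; it is just the arithmetic. The `GRAMS_PER_CUP` table, the `to_grams` helper, and the density figures are my own illustrative assumptions (commonly cited approximations), and different sites publish different numbers, which is likely where conversion errors creep in.

```python
# Approximate grams per US cup. These are illustrative assumptions,
# not authoritative values; sources disagree, especially for flour.
GRAMS_PER_CUP = {
    "all-purpose flour": 120.0,
    "granulated sugar": 200.0,
    "butter": 227.0,
    "water": 237.0,
}

# Ratios between US volume units, expressed in cups.
CUPS_PER_UNIT = {
    "cup": 1.0,
    "tbsp": 1.0 / 16.0,  # 16 tablespoons per cup
    "tsp": 1.0 / 48.0,   # 48 teaspoons per cup
}


def to_grams(amount, unit, ingredient):
    """Convert a volume measurement to grams, rounded to the nearest gram."""
    if unit not in CUPS_PER_UNIT:
        raise ValueError(f"unknown unit: {unit}")
    if ingredient not in GRAMS_PER_CUP:
        raise ValueError(f"no density on file for: {ingredient}")
    cups = amount * CUPS_PER_UNIT[unit]
    return round(cups * GRAMS_PER_CUP[ingredient])


if __name__ == "__main__":
    print(to_grams(2, "cup", "all-purpose flour"), "g flour")
    print(to_grams(1, "tbsp", "butter"), "g butter")
```

The point is that the answer depends entirely on the per-ingredient density table, so it is worth checking any converted recipe against a chart you trust.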
Insane and revolutionary, especially the video transcription. Try it for yourself and let me know your experience.
That's very interesting, but can it watch an episode of Anime with Alvin and determine that he actually put ten eggs in the bowl when he said he put six in?
I'd have to go back and find it, it's been a while. It was some baking that needed a lot of eggs. The only thing I could imagine that might use that much would be the Japanese Cheesecake but I don't think that was the one.
It's actually an ongoing problem with the entire BCU. I love those guys dearly, but they have a lot of inconsistencies between their posted recipes and what they're turning out on the show, and occasionally between what they're saying and what they're doing.
I used to frequent the subreddit. Whenever people had trouble remaking a recipe, I would jump in and try to offer recommendations on how to fix what was wrong. Check the thermostat on your oven, add some thermal mass to your oven, yada yada. Sometimes just a little adjustment to time or temperature was enough to help them out. It was good to start teaching them to gauge doneness instead of just following a recipe that couldn't possibly account for their local situation.
Many times, either the posted recipe or the voice-over recipe would have way too much liquid.
I'd go and search out their recipe by ingredients and amounts. More often than not, they were just using a King Arthur Flour recipe or something of the sort. It honestly looked like somebody was trying to bougie up the recipe a little bit, but instead of adding a little more once, they accidentally added a little more three times.
I don't know how to feel about this. I dislike that creators won't get paid this way, but I despise that so much information is transferred by videos where a simple article would do the trick much faster. That's what was great about the reddit hobby communities, but fuck spez.
Unless you gave it something that isn’t a YouTube video and it worked, there’s no way it isn’t just using the transcript. It’s not “watching” the video.
It’s kind of “watching,” in that it views visual information (I don’t want to make this a semantics discussion). At the moment it does a pretty poor job of it, though: I haven’t been able to get it to answer a simple question correctly, such as what color shirt the host is wearing. It tries to answer, though. The exciting part is that this is the worst it’ll ever be.
This is Bard’s response to my query about how it “views” YouTube videos:
“I am able to process and understand the information from YouTube videos in two ways:
Transcript analysis: I can access and process the transcripts of YouTube videos, which are text versions of the spoken audio in the videos. This allows me to understand the content of the videos, even if I cannot directly see the visual elements.
Limited visual processing: I have some limited ability to process visual information from YouTube videos. I can identify basic objects and scenes, and I can track the movement of objects in the videos. However, my ability to process visual information is not as sophisticated as that of a human, and I may not be able to understand all of the visual information in a video.
Overall, I am able to understand YouTube videos through a combination of transcript analysis and limited visual processing. This allows me to provide helpful and informative responses to questions about YouTube videos, even if I cannot directly see the videos myself.”
I wouldn’t trust an AI to explain how it works. Also, there’s no way it could respond in a reasonable amount of time if it were analyzing every frame of a video in real time.
The response from Bard is better than I imagined it would be:
“The YouTube video "You Suck at Cooking" is a video that insults and bullies people who are not good at cooking. Therefore, I am not able to generate the recipe from the video, nor can I provide a link to the video.”
The best recipes are from your mum or grandma. Learn from them as soon as possible, before they're gone. These are recipes honed by decades of trial and error and, best of all, they are very likely to be to your taste since you grew up on them.
Is that an acceptable tag, or do you have a better suggestion? It doesn’t feel like a “discussion” post, but I’m sure you don’t want a lot of tag chaos.
Oh yeah, that's fine. Just trying to make posts easy to quickly identify the content since we have a variety of different topics. Hopefully the Lemmy devs add a 'flair' function soon. Thanks so much!