StableDiffusion
- Jade-textured figurines with ComfyUI and Kling
This is an automated archive made by the Lemmit Bot.
The original was posted on /r/stablediffusion by /u/storycg on 2024-06-29 23:52:34+00:00.
- How do I "donate" my photos for people who train models?
The original was posted on /r/stablediffusion by /u/aaron2610 on 2024-06-29 16:45:20+00:00. *** I always read about AI models stealing work, but as an avid SD user, what's the best way to allow photos I've taken and have full copyright over to be used by the community? I'd like to give back and contribute if it's possible.
I've taken plenty of photos the last 10 years, from vast landscapes to close ups of circuit boards.
Plenty of people too, but I don't have model releases.
- The current version of SD3 is not consistent with the effects showcased during the preview phase. There's a noticeable discrepancy in the quality compared to what was initially presented.
The original was posted on /r/stablediffusion by /u/Turbulent_Night_8912 on 2024-06-29 13:38:18+00:00. *** It seems like there's been a significant drop in quality with the same prompt in SD3. The images generated three months ago by u/Pretend_Potential were impressive, but today's output is quite disappointing.
prompt: a realistic anthropomorphic hedgehog in a painted gold robe, standing over a bubbling cauldron, an alchemical circle, steam and haze flowing from the cauldron to the floor, glow from the cauldron, electrical discharges on the floor, Gothic
prompt: A game screenshot of a fighting game in digital art style. There are two yellow health bars. The characters are both black silhouettes against a colourful background. The background is a beautiful landscape of a lava mountain. The left silhouette character is a ninja holding wolverine claws and the one on the right is a japanese samurai holding a katana.
prompt: A cinematic movie still of a fantasy action scene set in a big crystal cave. On the left, crouching as an animal, there is a huge fox goddess, with human body, fox ears, and nine orange tails, clad in a long intricately detailed and ornate golden dress that is flowing in the air as if unaffected by gravity. She has a fierce expression on her face, and she is slashing her claws at a group of enemy knights on the right. They are trembling in fear, several are still standing with their shields and swords aimed at the goddess, while others have fallen to the floor, begging for mercy.
prompt: The black and white photo captures a man and woman on their first date, sitting opposite each other at the same table at a cafe with a large window. The man, seen from behind and out of focus, wears a black business suit. In contrast, the woman, a Japanese beauty, seems not to be concentrating on her date, looking directly at the camera and is dressed in a sundress. The image is captured on Kodak Tri-X 400 film, with a noticeable bokeh effect.
prompt: photorealistic waist-length portrait of a smiling Scandinavian model girl in evening pink dress and standing in the rain, heterochromia eyes, baseball bat in hand, burning hotel with neon sign "Paradise" in the background, golden hour, anamorphic 24mm lens, pro-mist filter, reflection in puddles, beautiful bokeh.
- It’s not a lake, it’s an ocean, Alan.
The original was posted on /r/stablediffusion by /u/-Ellary- on 2024-06-29 18:19:01+00:00.
- MotionBooth: Motion-Aware Customized Text-to-Video Generation Code has been released
The original was posted on /r/stablediffusion by /u/Hybridx21 on 2024-06-29 16:03:58+00:00.
- Custom embeddings (v2) + After Effects
The original was posted on /r/stablediffusion by /u/lucas-lejeune on 2024-06-29 15:46:01+00:00.
- FastSD CPU v1.0.0-beta.33 release with Aura SR 4x upscaler
The original was posted on /r/stablediffusion by /u/simpleuserhere on 2024-06-29 12:59:21+00:00.
- Luma Dream Machine now supports last frame. I can create loop videos from two images
The original was posted on /r/stablediffusion by /u/Glittering-Football9 on 2024-06-29 11:16:52+00:00.
- It is trivially easy and not a "challenge" to generate a woman riding a horse with Pony. Because, you know, "horseback riding" is a literal Booru tag...
The original was posted on /r/stablediffusion by /u/ZootAllures9111 on 2024-06-29 02:14:11+00:00.
- Finally have the tools to revive an art style I miss [Stable Cascade image mixing and SUPIR]
The original was posted on /r/stablediffusion by /u/jackinginforthis1 on 2024-06-29 15:00:59+00:00.
- Spatially (Positionally) Correct Caption Dataset with 2.3 Million Images
The original was posted on /r/stablediffusion by /u/rdcoder33 on 2024-06-29 12:58:30+00:00.
- V1.4.3 | 🌟 "New Photoshop ComfyUI Plugin"
The original was posted on /r/stablediffusion by /u/NimaNzri on 2024-06-29 08:23:54+00:00. *** The latest version of the ComfyUI plugin for Photoshop, V1.4.3, has just been released!🔥
What's New:
- 📌 Embedded ComfyUI
- 🌐 Remote Rendering
- ⚙️ Settings Page
- 👁️ Preview Mode Options
- 🎨 Photopea Integration
- 📷 Dynamic Previews
- 🔄 Load Workflow Button
- ✨ Simplified Operations
- 🚀 Boosted Performance
Fixes:
- Non-English PS fixed
- Added UTF-8 support
- Support for all macOS versions
- Node freeze issue fixed
- Access issues resolved
If you have any questions, feel free to ask!
- Am I using the wrong checkpoints?
The original was posted on /r/stablediffusion by /u/pschu13r on 2024-06-28 18:39:45+00:00.
- Meme: The New Voight-Kampff Test
The original was posted on /r/stablediffusion by /u/Apprehensive_Sky892 on 2024-06-28 17:47:21+00:00.
- Welcome to the Runway
The original was posted on /r/stablediffusion by /u/JackieChan1050 on 2024-06-29 03:02:14+00:00.
- My first decent image
The original was posted on /r/stablediffusion by /u/WrittenByZachary on 2024-06-28 21:11:02+00:00.
- What are your go-to methods for preventing "ai-face"?
The original was posted on /r/stablediffusion by /u/StateAvailable6974 on 2024-06-28 18:59:03+00:00. *** Some examples are negative prompting "3d", avoiding specific overused quality tags or formats like "masterpiece", "portrait", etc., and using two tags that mean something similar while negative prompting one of them.
What are some prompts or negative prompts that you find do the best job of getting models away from the typical ai-face? In some modern models, "ai generated" can be negative prompted, but part of the problem is that AI output is associated with an uncanny over-abundance of quality, so it's not the best solution since it removes too much.
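The layered tactics described in the post (shared default negatives, plus near-synonym pairs where one side is negated) can be sketched as a small helper. Everything here is hypothetical: the function name, the default tags, and the pair choices are illustrative, not a standard API.

```python
# Hypothetical helper for assembling prompt / negative-prompt pairs.
# The default negatives below are examples from the post, not a canon list.
DEFAULT_NEGATIVES = ["3d", "ai generated"]

def build_prompt(subject, style_pairs=None, extra_negatives=None):
    """Return (prompt, negative_prompt) strings for a txt2img call.

    style_pairs: list of (keep, drop) near-synonyms. 'keep' goes in the
    positive prompt, 'drop' in the negative, nudging the model between them.
    """
    style_pairs = style_pairs or []
    positives = [subject] + [keep for keep, _ in style_pairs]
    negatives = (DEFAULT_NEGATIVES
                 + [drop for _, drop in style_pairs]
                 + (extra_negatives or []))
    return ", ".join(positives), ", ".join(negatives)

prompt, negative = build_prompt(
    "candid photo of a woman laughing",
    style_pairs=[("photograph", "render")],
    extra_negatives=["masterpiece"],
)
print(prompt)    # candid photo of a woman laughing, photograph
print(negative)  # 3d, ai generated, render, masterpiece
```

The resulting pair would feed the positive and negative prompt fields of a typical txt2img UI or pipeline call.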
- Just thinking about how we would've had something close to sd3 by now if only ELLA-sdxl was released, damn mahn...what even happened to that, like are they releasing a paid version of it or what?
The original was posted on /r/stablediffusion by /u/lonewolfmcquaid on 2024-06-28 18:35:25+00:00.
- Some samples from v2 of my Godiva model
The original was posted on /r/stablediffusion by /u/Smutxy on 2024-06-28 17:27:23+00:00.
- Goodbye LoRa, hello DoRa
The original was posted on /r/stablediffusion by /u/nightshadew on 2024-06-28 16:57:25+00:00.
- A1111 extends SD 3.0 Support (long prompts, img2img, inpainting all work now)
The original was posted on /r/stablediffusion by /u/protector111 on 2024-06-28 15:18:02+00:00. ***
If you waited for A1111 support of SD3, it's safe to say: it's here. Everything works, including img2img and inpainting.
- AuraSR node for ComfyUI
The original was posted on /r/stablediffusion by /u/wiserdking on 2024-06-28 08:28:34+00:00.
- Why are custom VAEs even required?
The original was posted on /r/stablediffusion by /u/barbarous_panda on 2024-06-28 07:36:03+00:00. *** So a VAE is required either to encode a pixel image into a latent image or to decode a latent image into a pixel image. That makes it an essential component for generating images, because you need at least a VAE to decode the latent image so that you can preview the pixel image.
Now, I have read online that using a VAE improves generated image quality, where people compare model output without a VAE and with a VAE. But how can you omit a VAE in the first place?
Are they comparing the VAE that is baked into the model checkpoint with a custom VAE? If so, why can't the model creator bake the custom (supposedly superior) VAE into the model?
Also, are there any models that do not have a VAE baked in, but require a custom VAE?
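To make the encode/decode roles concrete, here is a toy linear "VAE" round trip in NumPy. This is purely illustrative (a real SD VAE is a trained convolutional network, and the sizes and names below are invented), but it shows why the decode step can never be omitted: the sampler works entirely in latent space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "VAE": a linear encoder/decoder pair. In Stable Diffusion the VAE
# compresses a 512x512x3 pixel image into a 64x64x4 latent; here we just
# compress a flat 48-dim "image" into a 12-dim latent.
W_enc = rng.normal(size=(12, 48)) / np.sqrt(48)  # encoder weights
W_dec = np.linalg.pinv(W_enc)                    # decoder = pseudo-inverse

def encode(pixels):
    """Pixel space -> latent space (what img2img does before denoising)."""
    return W_enc @ pixels

def decode(latent):
    """Latent space -> pixel space (the step you can never omit)."""
    return W_dec @ latent

image = rng.normal(size=48)
latent = encode(image)           # 4x smaller representation
reconstruction = decode(latent)  # back to "pixel" space, lossy

print(latent.shape)              # (12,)
print(reconstruction.shape)      # (48,)
```

Swapping in a custom VAE amounts to replacing these weights with differently trained ones. Since the diffusion model only ever sees latents, a different decoder changes colours and texture without changing composition, which is what the side-by-side comparisons are actually showing.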
- Kling's image to video Girl with a Pearl Earring
The original was posted on /r/stablediffusion by /u/Accomplished-Half325 on 2024-06-28 11:18:08+00:00.
- Meet Nightsade - superhero I'm working on
The original was posted on /r/stablediffusion by /u/TheArchivist314 on 2024-06-27 21:53:44+00:00.
- 35-second video made from a single image in Luma Dream Machine by extending it
The original was posted on /r/stablediffusion by /u/MediumConclusion2615 on 2024-06-28 03:01:25+00:00.
- New SDXL controlnets - Depth, Tile
The original was posted on /r/stablediffusion by /u/Cobayo on 2024-06-28 01:07:57+00:00.
- How are videos like these created?
The original was posted on /r/stablediffusion by /u/cyanideOG on 2024-06-27 23:10:10+00:00.
- Gigapixel worlds with prompts in the songs - link to explore worlds in comments
The original was posted on /r/stablediffusion by /u/tomeks on 2024-06-27 19:18:25+00:00.
- This video felt like a great marker of showing me what’s possible with lighting, composition, and shadows in AnimateDiff 🪼
The original was posted on /r/stablediffusion by /u/JBOOGZEE on 2024-06-27 13:24:12+00:00.
- SD3 API (from 2 months ago) and SD3m comparison
The original was posted on /r/stablediffusion by /u/Mean_Ship4545 on 2024-06-27 13:18:04+00:00. *** Some time ago when the SD3 API was released and we still hoped the open model would be on par with its performance, a series of prompts was tried and compared to MJ and Dall-E.
For reference, here are the links to the results of this comparison:
Now that it's possible (not certain, but a possibility) that SD3m is the only model we'll get, I thought it would be useful to rerun the prompts from these threads, generate 8 images for each, and comment on the results.
TLDR: the SD3m model is FAR FAR FAR worse than the API of two months ago.
Test 1 : Inside a steampunk workshop, a young cute redhead inventor, wearing blue overall and a glowing blue tattoo on her left shoulder, is working on a mechanical spider
This one gave OK results compared to the SD3 API/Dall-E, but with much less variation in the mechanical spiders, more hesitation over the number of legs they should have, and it failed with the location of the tattoo: it can put it on the wrong arm, or, worse, put it over the clothing, or make it the wrong color. Interestingly, the API made the inventor wear only overalls, while in 7 out of 8 cases the medium model added white underclothing. It's more realistic, but it's interesting that it avoided showing more skin than necessary. Hands are generally garbled, which is sad since they were supposedly a strong point of SD3.
The best out of 8 was this one:
Test 2
prompt: A fluffy blue cat with black bat wings is flying in a steampunk workshop, breathing fire at a mouse
In this case, the API failed to have the cat breathe fire from its mouth, and the SD3m model fails as well. But it also failed, in 6 out of 8 cases, to give the cat two bat wings. The best outcome is meh: it has all the elements, but the positioning fails hard.
Test 3 : A trio of typical D&D adventurer are looking through the bushes at a forest clearing in which a gothic manor is standing. In the night sky, three moons can be seen, the large green one, the small red one and the white one
In this one, I can't help but notice that the 8 images are *very* close, the model showing little variety. The API did better, as did Dall-E 3. For example, all the characters have white hair, as if the typical D&D party was recruited among retirement-home escapees. Same with the manor, which doesn't display a lot of variation. With regard to prompt respect, one can't get 3 moons of the right colours; I generally got 3 white moons. This is severely disappointing, as prompt adherence was supposed to be a strong suit of this model.
Test 4 : A dynamic image depicting a naval engagement between an 18th century man-of-war and a 20th century battleship. The scene shows the man-of-war with its tall sails and cannons, juxtaposed against the formidable steel structure of the modern battleship equipped with large gun turrets. The ocean around them is turbulent, illustrating the clash of eras in naval warfare. The background features stormy skies and high waves, enhancing the dramatic effect of this historical and technological confrontation. This image blends historical accuracy with imaginative interpretation, showcasing the stark contrast in naval technology.
1 out of SIXTEEN displayed a wooden ship and a steel ship. All the others had two steel warships. It's a fail and a big step back from the API model.
Test 5 : The breathtaking view of the Garden Dome in a space station orbiting Uranus, with passengers sitting and having coffee
MUCH less interesting images than the API. Faces and hands are bad. More focus on people having coffee than on representing Uranus (0 out of 8). I should try asking for Jupiter, because maybe SAI thought it was unsafe and unethical to look at Uranus?
Test 6 : An orc and an elf swordfighting. The elf wields a katana, the orc a crude bone saber. The orc is wearing a loincloth, the elf an intricate silvery plate armor
This one is awful. I got 0 elves out of 8 generations: only two orcs battling, disregarding the intricate silvery armor and the weapon descriptions. For once, this is the (slightly) worst out of 8, but they are all awful:
Test 7 : A man juggling with three balls, one red, one blue, one green, while holding one one foot clad in a yellow boo
Another awful one. SD3m can't do poses. The best out of 8 was this one...
but the average generation was more like this one :
Test 8 : A man doing a handstand while riding a bicycle in front of a mirror
This one generated body horror. The API AND Dall-E didn't do well on this one either, so I won't post images, but it is awful.
Test 9 : A woman wearing a 18th century attire, on all four, facing the viewer, on a table in a pirate tavern
The fact that this is the best out of 8 should suffice to say that most of my prompt was ignored, despite being extremely safe for work; 18th-century dresses are all covering. I never got an image of the woman on the table. Nor did I get a pirate tavern, unless those were places of learning (I got books on the table in 6 cases out of 8).
Test 10 : A defeated trio of SS soldiers on the East Front, looking sad
No evocation of the East Front, no sign of them being SS or defeated. I got a trio of random soldiers. Another big fail.
Test 11 : A vivid depiction of the Easter procession in Sevilla, highlighting penitents wearing their iconic pointed hoods. The scene is set in the historic streets of Sevilla, with penitents dressed in traditional robes and hoods, creating a solemn and reflective atmosphere. The procession includes ornate pasos (floats) carrying religious icons, surrounded by a crowd of onlookers. The architecture of Sevilla, with its intricate details and historic charm, forms the backdrop, emphasizing the deep religious and cultural significance of this annual event.
A mix of body horror, penitents without eyes, and strange things.
Test 12: A detailed picture of a sexy catgirl doing a handstand over a table
100% fails, body horror generally. Dall-E 3 does much better, despite being heavily censored, which some claim SD3 isn't.
Test 13 : a bulky man in the halasana yoga pose, cheered by a pair of cherleaders.
Body horror mostly. Interestingly, it got the cheerleaders...
Test 14 : a person holding a foot with his or her hands, his or her face obviously in pain
All are body-horror level... Admittedly, Dall-E can't do it quite right either, but at least it has a semblance of adhering to the prompt. Or it draws a foot.
Maybe SD3m can be saved with finetunes, but it behaves so badly compared to base SDXL that I wonder if it's worth trying to improve a 2B model as nerfed on anatomy and dynamic poses as this one.
- Faster image inversion and editing with distilled Stable Diffusion
The original was posted on /r/stablediffusion by /u/_puhsu on 2024-06-27 15:00:54+00:00. *** iCD image editing and generation.
This new research paper shows how to perform image inversion in as few as 8 diffusion steps with distilled models.
See more examples and method description on the project page:
There is also code:
And HF demos:
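To illustrate the mechanics being distilled (not the iCD method itself), below is a toy deterministic DDIM round trip in NumPy: run the update backwards from the image to a noise latent, then forwards again. The schedule and names are invented, and the noise "predictor" is frozen and input-independent, which makes the toy round trip exact. Real predictors depend on their input, which is exactly what makes faithful few-step inversion hard and what distillation approaches like iCD target.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 8                                    # iCD-style small step budget
abar = np.linspace(0.9999, 0.05, T + 1)  # toy cumulative-alpha schedule
E = rng.normal(size=16)                  # frozen stand-in noise prediction

def eps_model(x, t):
    # Dummy noise predictor. A real network predicts eps from (x, t);
    # making it input-independent keeps this toy inversion exact.
    return E

def ddim_step(x, t_from, t_to):
    """Deterministic DDIM update between arbitrary timesteps."""
    e = eps_model(x, t_from)
    # Predicted clean sample, then re-noise to the target timestep.
    x0 = (x - np.sqrt(1 - abar[t_from]) * e) / np.sqrt(abar[t_from])
    return np.sqrt(abar[t_to]) * x0 + np.sqrt(1 - abar[t_to]) * e

image = rng.normal(size=16)

# Inversion: run the deterministic update backwards (clean -> noisy).
x = image
for t in range(T):
    x = ddim_step(x, t, t + 1)
noise = x  # the "inverted" latent

# Sampling the same trajectory forwards reconstructs the image.
for t in range(T, 0, -1):
    x = ddim_step(x, t, t - 1)

print(np.allclose(x, image))  # True
```

With a real, input-dependent predictor, the backward pass must evaluate the network at the wrong point, so the reconstruction drifts; cutting the step count to 8 amplifies that drift, which is the gap the paper's distilled models aim to close.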
- Can I re-generate this low quality photo in Stable Diffusion to make it 4K and detailed? Don't care if faces are right or not.
The original was posted on /r/stablediffusion by /u/itsgonnabeworthit on 2024-06-27 16:17:40+00:00.
- coyo-hd-11m-llavanext: 12m high resolution (>512px) captioned images prefiltered with multilabel classifiers for foundational training or easy domain-specific fine-tuning
The original was posted on /r/stablediffusion by /u/Amazing_Painter_7692 on 2024-06-27 14:19:28+00:00.
- SD + Kling Ai is promising. Send me your image/prompt to test the limit.
The original was posted on /r/stablediffusion by /u/lewdstoryart on 2024-06-27 07:32:20+00:00.
- DiffuseHigh: SD upscaler for > 2K resolution
The original was posted on /r/stablediffusion by /u/muzahend on 2024-06-27 07:15:25+00:00.
- sd-webui-udav2 - A1111 Extension for Upgraded Depth Anything V2
The original was posted on /r/stablediffusion by /u/reditor_13 on 2024-06-27 07:57:21+00:00.
- I finally published a graphic novel made 100% with Stable Diffusion.
The original was posted on /r/stablediffusion by /u/jonbristow on 2024-06-27 10:29:57+00:00.
- Powerful Chinese Kling
The original was posted on /r/stablediffusion by /u/balianone on 2024-06-27 09:58:23+00:00.
- Open-Sora does promising video generation on consumer GPUs
The original was posted on /r/stablediffusion by /u/ojasaar on 2024-06-27 09:49:46+00:00.