Open Source Harm Reduction Chatbot

Cheshireai

Greenlighter
Joined
Dec 8, 2023
Messages
2
I'm an amateur AI enthusiast with an interest in training open source LLMs. To simplify it for anyone who's not familiar, open source LLMs are chatbot models (like a program) similar to ChatGPT, but they can be run on a personal computer without any internet access. So you can ask a question without worrying about your information or chat history being sent back to some company mining your data for god knows what. It never has to leave your computer or phone.

What I need to do is create a set of questions and answers that will be used to train the model: things like factual questions and answers about the effects of drugs and their routes of administration, corrections of common myths or misconceptions, and emotional support for dealing with isolation, paranoia, or other crises. So it could basically be a personal, judgment-free companion that can answer drug-related questions with infinite patience and empathy.
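
For anyone wondering what a dataset like that could look like on disk, here's a minimal sketch in Python. It assumes one question/answer pair per line in JSON Lines format; the field names (`question`, `answer`, `category`, `source`) and the helper functions are hypothetical choices for illustration, not an established standard or a format the project has settled on. The example values are placeholders rather than real content.

```python
import json


def make_record(question, answer, category, source):
    """Build one training example (hypothetical layout).

    Refuses entries without a cited source, so unverified
    answers can't slip into the dataset unnoticed.
    """
    if not source.strip():
        raise ValueError("every answer needs a citable source")
    return {
        "question": question.strip(),
        "answer": answer.strip(),
        "category": category,
        "source": source.strip(),
    }


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Placeholder values only; real entries would come from reviewed sources.
records = [
    make_record(
        question="<question text goes here>",
        answer="<reviewed answer text goes here>",
        category="myth_correction",
        source="<citation for the reviewed source>",
    ),
]
jsonl_text = to_jsonl(records)
```

Keeping a `source` field on every record is one way to make each answer traceable back to where it came from, which would make the dataset easier to audit if someone else picks it up later.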

Obviously it's going to be a lot of work; I need to make sure that it's not hallucinating dangerously wrong information. But a project like this seems inevitable (other people build models for general therapy, medical information, finance, etc.), and it could potentially do some good eventually. I also want to open source the training datasets, so if for some reason I can't continue with the project, anyone else with beginner-level machine learning skills could easily pick up where I left off, or even just fork the project if they hate my guts and think they can do it better.

What I'm looking for is some feedback on the concept, and maybe some direction for collecting the kind of data I need. The main thing for something like this is that quality is insanely important. The accuracy and trustworthiness of the information need to be beyond reproach. Scraping random forum threads and hoping that they're mostly good quality is not really an option. If anyone has leads on textbook-quality source data that revolves around harm reduction, coping with addiction, or supporting loved ones with addiction, I'd greatly appreciate them. Also, I didn't really know where to post this, so if there's some other place I should be asking, or even other platforms, let me know. I'm open to any and all suggestions.

Thank you for reading.
 
@Shady's Fox
 
Welcome to Bluelight. Tell us a little more about yourself, what brought ya here?


This idea sounds interesting, though. Do you have a Discord?
 
When I was younger, I was involved with drug policy reform and activism; I used to volunteer for organizations like MAPS and some activist groups. But life happened and I didn't really have the time I needed for those kinds of things anymore. I smoke a lot of weed and dabble with psychedelics, mostly mushrooms. I used to lurk on Bluelight a lot for drug information but never really saw the need to create an account and participate.

My main interest is learning about AI and training chatbots. My primary project is creating an NSFW chatbot that can do erotic roleplay and story-writing without censorship or restrictions. When I posted on various forums/subreddits about what I was trying to do, I got a LOT of feedback, and people really helped point me to where I needed to go to find the data I needed. All the data I curate and organize I release for free so people can see how I did it and hopefully improve on it. Here's what I've released so far for openerotica. I'm a one-person show so it's slow going, but I'm trying to learn as fast as I can.

I do have Discord: @Cheshireai. I also created a server for working on this project: https://discord.gg/mCPGGuXR
 