> FWIW, this is pretty much what has been described as "waluigi" effect a bit extended
Sorry, I disagree for a few reasons. First, turning the lights on is literally the only thing the bot can do to heat up the house at all, and turning them on does heat it up a little bit. So it's the right answer. Second, that's not the Waluigi effect, not even "pretty much" and not even "a bit extended". Both involve things LLMs say, but the similarity ends there.
The Waluigi effect applied to this scenario might look like this: you tell the bot to make the house comfortable, and describe everything a comfortable house is like. In doing so, you have also implicitly told the bot how to make the most uncomfortable house possible. Its behavior is only one comfortable/uncomfortable flip away from creating a living hell. Say that in the course of its duties the bot is for some reason unable to make the house as comfortable as it intended. It might conclude that it failed because it was actually trying to make the house uncomfortable all along. Now you've got a bot turning your house into some haunted-house Beetlejuice nightmare.