This doesn't scale, and it's going to go badly if some app passes a lot of data. If it's really hard to process the entire input, then reading only a few bytes would be better than reading everything.
I know, there's still a lot missing. But shouldn't we first have a working implementation and then care about scalability?
I'll change it to something like reading line by line (I need to test that first), but that's going to add more complexity - that's why I wanted to put it into separate patches.
Regards, Fabian Maurer