On 11/17/21 08:36, Jinoh Kang wrote:
+static void packet_reply_add_raw(struct gdb_context* gdbctx, const void* data, size_t len) +{ + const unsigned char *ptr = data, *sptr, *end = ptr + len; + size_t seg_len; + int is_end;
I'm not sure what "seg_" is supposed to mean? How about just "len", and "tmp" instead of sptr?
"seg" stands for segments (substrings), separated by special characters to be escaped. Ditto for "s" in "sptr".
That said, I agree "tmp" would work just as fine. Since "len" is already taken, maybe something like "tmplen" would do?
+ while (ptr != end) + { + for (sptr = ptr; sptr != end && !is_gdb_special_char(*sptr); sptr++) + ;
+ seg_len = sptr - ptr; + is_end = sptr == end;
+ packet_reply_grow(gdbctx, is_end ? seg_len : seg_len + 2); + memcpy(&gdbctx->out_buf[gdbctx->out_len], ptr, seg_len); + gdbctx->out_len += seg_len; + ptr += seg_len;
+ if (is_end) + break;
+ gdbctx->out_buf[gdbctx->out_len++] = 0x7D; + gdbctx->out_buf[gdbctx->out_len++] = 0x20 ^ *ptr++; + } +}
What about something like:
Thanks a lot for taking time to improve it. A few comments, though:
const unsigned char *begin = data, *end = begin + len, *ptr = begin;
I'm not sure the name "begin" is clear enough, due to the asymmetry of semantics of "begin" and "end".
- "begin" denotes the start of the subsequence we are currently working on.
- "end" denotes the end of the entire input data.
Instead of "begin", how about "segment", "part", or "curr"? Alternatives include "fragment" and "curr_part", but they don't sound as attractive.
Sure, "tmp" or "curr" should be generic enough to not worry about the semantics anymore.
while (ptr != end) { while (ptr != end && !is_gdb_special_char(*ptr)) ptr++;
packet_reply_grow(gdbctx, ptr - begin + 2);
I see that the extra "is_end" variable doesn't help with readability.
If we're rewriting for clarity, how about calling "packet_reply_grow" twice instead of unconditionally growing the buffer for the extra two bytes of escape sequence? The reasons are:
Since it always allocates two more bytes at the end of the data, "packet_reply_add_raw" ends up eating extra 2B of space for each call (2*n).
It should be more often the case of "ptr == end" than "is_gdb_special_char()". Special characters are rare, and it's better to optimise for the more likely path.
Works for me too, although I don't think the extra bytes would change anything in general as the allocated areas are probably rounded anyway.
memcpy(&gdbctx->out_buf[gdbctx->out_len], begin, ptr - begin); gdbctx->out_len += ptr - begin; if (ptr == end) break;
gdbctx->out_buf[gdbctx->out_len++] = 0x7D; gdbctx->out_buf[gdbctx->out_len++] = 0x20 ^ *ptr++; begin = ptr;
I suppose this assignment could be moved to the beginning of the loop body...
}
Could do with a len variable too if you feel like it's better.
...unless we're using tmp_len instead of "ptr - begin", of course.
static inline void packet_reply_add(struct gdb_context* gdbctx, const char* str) { - int len = strlen(str); - packet_reply_grow(gdbctx, len); - memcpy(&gdbctx->out_buf[gdbctx->out_len], str, len); - gdbctx->out_len += len; + packet_reply_add_raw(gdbctx, str, strlen(str)); } static void packet_reply_open(struct gdb_context* gdbctx) { assert(gdbctx->out_curr_packet == -1); - packet_reply_add(gdbctx, "$"); + packet_reply_grow(gdbctx, 1); + gdbctx->out_buf[gdbctx->out_len++] = '$'; gdbctx->out_curr_packet = gdbctx->out_len; } @@ -773,7 +826,8 @@ static void packet_reply_close(struct gdb_context* gdbctx) int plen; plen = gdbctx->out_len - gdbctx->out_curr_packet; - packet_reply_add(gdbctx, "#"); + packet_reply_grow(gdbctx, 1); + gdbctx->out_buf[gdbctx->out_len++] = '#'; cksum = checksum(&gdbctx->out_buf[gdbctx->out_curr_packet], plen); packet_reply_hex_to(gdbctx, &cksum, 1); gdbctx->out_curr_packet = -1; @@ -812,8 +866,6 @@ static enum packet_return packet_reply(struct gdb_context* gdbctx, const char* p { packet_reply_open(gdbctx); - assert(strchr(packet, '$') == NULL && strchr(packet, '#') == NULL);
packet_reply_add(gdbctx, packet); packet_reply_close(gdbctx);
After this change packet_reply_add helper seems a little useless, I suggest maybe:>
- Keep packet_reply_add like it is, and use it everywhere some static strings are added, including in open/close here, where we know we don't need escaping.
My intention was to repurpose packet_reply_add to be used *only* for emitting payload data in GDB packets, and not sending arbitrary data to the socket. Otherwise, someone else might end up using the wrong function and run into hard-to-debug protocol errors. By repurposing it, the code clearly draws the line between the (high-level) GDB protocol and the underlying raw serial transport.
Also, the overhead of scanning short string literals should not be significant.
Yeah I wasn't really worried about that, more about the mostly empty helper that's only calling the _raw version.
Maybe we can have just a single "packet_reply_add" helper which always escape the data, and not call it for the '$' and '#' cases?