I'm trying to make a program that deletes extra white spaces from a string. It should keep one space but delete any extra.
I wrote this code, which works up until the point I have to change the passed pointer's address to the new string I made in the function. I have read that this is not possible without making a pointer to a pointer. The problem is, for my assignment, I can't change the argument types.
Is there any way to do this without making a new string, or is there a way to change the pointer address of s to point to newstr?
My code:
void str_trim(char *s)
{
// your code
char newstr[200];
char *newstrp = newstr;
while (*s != '\0')
{
if (*s != ' ')
{
*newstrp = *s;
*newstrp++;
*s++;
}
else if (*s == ' ')
{
*newstrp = *s;
*newstrp++;
*s++;
while (*s == ' ')
{
s++;
}
}
else
{
*newstrp = *s;
*newstrp++;
*s++;
}
}
s = &newstr;
}
Edit: This is what I ended up using
char *p = s, *dp = s;
while (*dp == ' ')
dp++;
;
while (*dp)
{
if (*dp == ' ' && *(dp - 1) == ' ')
{
dp++;
}
else
{
*p++ = *dp++;
}
}
if (p > s && *(p - 1) == ' ')
{
p--;
}
*p = '\0';
Is there any way to do this without making a new string, or is there a way to change the pointer address of s to point to newstr?
There is no way to change the caller's variable without changing the argument type of s to char**. So, that leaves you with two choices:
You can change your function to return the new char* pointer. However, you can't return a pointer to a local variable, which means you will have to malloc() the new string, and then the caller will have to free() it (probably not what you want to do), e.g.:
char* str_trim(char *s)
{
char *newstr = malloc(200);
//...
return newstr;
}
char *s = str_trim(...);
...
free(s);
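As a fuller illustration of this first option, here is a minimal sketch of an allocating version. The name str_trim_copy and the const parameter are assumptions made for this example, not part of the assignment. It sizes the buffer from the input instead of using a fixed 200 bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of s in which every
   run of spaces is collapsed to a single space, or NULL if malloc() fails.
   The caller owns the result and must free() it. */
char *str_trim_copy(const char *s)
{
    /* The result is never longer than the input, so strlen(s) + 1 bytes suffice. */
    char *out = malloc(strlen(s) + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    while (*s != '\0')
    {
        *dst++ = *s;            /* copy the character, including the first space of a run */
        if (*s == ' ')
        {
            while (*s == ' ')   /* skip the whole run of spaces */
                ++s;
        }
        else
        {
            ++s;
        }
    }
    *dst = '\0';
    return out;
}
```

The caller would use it as char *t = str_trim_copy(s); and later free(t); the original string is left untouched.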
Otherwise, simply get rid of the local char[] buffer altogether, and just modify the contents of s in-place (this is your only option if you can't change the signature of str_trim() at all), e.g.:
void str_trim(char *s)
{
char *newstr = s;
while (*s != '\0')
{
if (*s != ' ')
{
*newstr = *s;
++newstr;
++s;
}
else
{
*newstr = *s;
++newstr;
++s;
while (*s == ' ')
{
++s;
}
}
}
*newstr = '\0';
}
char s[] = "...";
str_trim(s);
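The statement at the top of this answer, that assigning to s inside the function cannot change the caller's variable, can be seen in a small sketch. The function names and the replacement_text buffer are hypothetical, chosen only for illustration.

```c
/* Hypothetical buffer with static storage, so a pointer to it stays valid
   after the functions below return. */
static char replacement_text[] = "replacement";

/* s is a copy of the caller's pointer: assigning to it changes only the copy. */
void retarget_by_value(char *s)
{
    s = replacement_text;
    (void)s;    /* silence "set but not used" warnings */
}

/* ps points at the caller's pointer, so assigning through it
   changes the caller's variable. */
void retarget_by_address(char **ps)
{
    *ps = replacement_text;
}
```

After retarget_by_value(p) the caller's p still points where it did before; only retarget_by_address(&p) makes it point at the new buffer. Since the assignment fixes the parameter type as char *, writing through the pointer (modifying the characters in place) is the only effect the caller can observe.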
The problem is, for my assignment, I can't change the argument types.
Let's consider the specified function declaration
void str_trim(char *s);
Its single parameter is a pointer to (the first character of) a non-constant string. This means that the function may change the passed string. It is a contract between the user of the function and the function itself.
So, also taking into account the void return type, we can conclude that the passed string must be changed in place. No other character array is created in the function and returned from it.
In any case, declaring a character array within the function like this
char newstr[200];
with the magic number 200 does not make sense. The function should be able to process a string of any size.
Usually such string functions return a pointer to the updated string. So I would declare the function like
char * str_trim(char *s);
Also, your function has redundant code:
if (*s != ' ')
{
*newstrp = *s;
*newstrp++;
*s++;
}
else if (*s == ' ')
{
*newstrp = *s;
*newstrp++;
*s++;
while (*s == ' ')
{
s++;
}
}
else
{
*newstrp = *s;
*newstrp++;
*s++;
}
Moreover, the else part of the if-else statement will never get control, because there are only two possibilities: either *s is equal to ' ' or it is not. There cannot be a third.
Even your updated function is too complicated.
Here is a demonstration program that shows how the function can look very simple, without redundant if-else statements and duplicated code.
#include <stdio.h>
char * str_trim( char *s )
{
for ( char *p = s, *q = s; *q; )
{
if ( *++q != ' ' || *p != ' ' )
{
if ( ++p != q )
{
*p = *q;
}
}
}
return s;
}
int main(void)
{
char s[] = "A    string   with    many   adjacent    spaces.";
puts( s );
puts( str_trim( s ) );
return 0;
}
The program output is
A    string   with    many   adjacent    spaces.
A string with many adjacent spaces.
If you want to preserve the original return type void, then the function will look like
void str_trim( char *s )
{
for ( char *p = s, *q = s; *q; )
{
if ( *++q != ' ' || *p != ' ' )
{
if ( ++p != q )
{
*p = *q;
}
}
}
}
and in main you should write
puts( s );
str_trim( s );
puts( s );
Compare my function definition with the function definitions in other answers. I think it is evident that my function definition is much simpler. Moreover, my function does not overwrite the passed string if it does not contain adjacent spaces.
Note that the tab character '\t' is also displayed as white space. Try, for example, to output the string "\tHello,\t\t\tWorld!"
puts( "\tHello,\t\t\tWorld!" );
So it is better to replace each encountered tab character '\t' with ' '.
In this case the function can look as shown below
#include <stdio.h>
#include <ctype.h>
char * str_trim( char *s )
{
if ( *s == '\t' ) *s = ' ';
for ( char *p = s, *q = s; *q; )
{
if ( !isblank( ( unsigned char )( *++q ) ) || !isblank( ( unsigned char )*p ) )
{
++p;
if ( *q == '\t' )
{
*p = ' ';
}
else if ( p != q )
{
*p = *q;
}
}
}
return s;
}
int main(void)
{
char s[] = "\tHello,\t\t\tWorld!";
puts( s );
puts( str_trim( s ) );
return 0;
}
The program output is
	Hello,			World!
 Hello, World!
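The isblank() call used above matches only the space and horizontal tab in the default "C" locale, whereas isspace() also matches newline, carriage return, vertical tab and form feed. A small sketch makes the difference concrete; the helper names count_blank and count_space are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: number of characters in s accepted by isblank(). */
size_t count_blank(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (isblank((unsigned char)*s))   /* cast avoids negative char values */
            ++n;
    return n;
}

/* Hypothetical helper: number of characters in s accepted by isspace(). */
size_t count_space(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (isspace((unsigned char)*s))
            ++n;
    return n;
}
```

For the string " \t\n\r\v\f" in the "C" locale, count_blank() gives 2 and count_space() gives 6, so the choice between the two decides whether line breaks are collapsed along with spaces and tabs.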
The coding goals apparently limit the function to modifying the string in place.
Since the new string, with white-space trimmed, will never be longer than the original, the result can live in s[].
Simply walk the string, noting whether the previous character was white-space. Increment a source pointer on each loop iteration and a destination pointer after each assignment.
Rather than look for ' ', use isspace() to detect all white-space.
is...() functions accept an unsigned char value or EOF. Access the string with unsigned char * to avoid negative char values.
#include <ctype.h>
#include <stdbool.h>
void str_trim(char *s) {
unsigned char *us_src = (unsigned char*) s;
unsigned char *us_dest = (unsigned char*) s;
bool previous_isspace = false;
while (*us_src) {
if (isspace(*us_src)) {
if (previous_isspace) {
us_src++;
} else {
*us_dest++ = *us_src++; // or = ' ';
previous_isspace = true;
}
} else {
*us_dest++ = *us_src++;
previous_isspace = false;
}
}
*us_dest = '\0';
}
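The function above keeps one white-space character at the start and at the end of the string when the input has them. If leading and trailing white-space should be dropped entirely, as in the asker's final version, one possible variant is sketched below. The name str_trim_all is an assumption for this example, and it writes ' ' for every collapsed run.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical variant: collapses each run of white-space to one ' ' and
   removes leading and trailing white-space, all in place. */
void str_trim_all(char *s)
{
    const unsigned char *src = (const unsigned char *)s;
    char *dest = s;
    bool pending_space = false;

    while (*src) {
        if (isspace(*src)) {
            /* Remember the gap only if something was already written,
               which drops leading white-space. */
            pending_space = (dest != s);
        } else {
            if (pending_space) {
                *dest++ = ' ';      /* emit the gap only once a word follows it */
                pending_space = false;
            }
            *dest++ = (char)*src;
        }
        src++;
    }
    *dest = '\0';   /* a trailing gap is never emitted */
}
```

With this variant the buffer "  a \t b  " ends up holding "a b". The write position never passes the read position, so no unread character is overwritten.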
Is there any way to do this without making a new string, or is there a way to change the pointer address of s to point to newstr?
Yes, you can refer to functions such as canonicalize_newline and remove_backslash_newline in the chibicc repo, which rewrite their input in place without creating a new string.
// Replaces \r or \r\n with \n.
static void canonicalize_newline(char *p) {
int i = 0, j = 0;
while (p[i]) {
if (p[i] == '\r' && p[i + 1] == '\n') {
i += 2;
p[j++] = '\n';
} else if (p[i] == '\r') {
i++;
p[j++] = '\n';
} else {
p[j++] = p[i++];
}
}
p[j] = '\0';
}
Try a similar approach, using two indices, i for reading and j for writing, to process the string in place.
static void remove_extra_whitespace(char *p) {
int i = 0, j = 0;
int space_found = 0;
while (p[i]) {
if (isspace((unsigned char)p[i])) {
if (!space_found) {
p[j++] = ' ';
space_found = 1;
}
} else {
p[j++] = p[i];
space_found = 0;
}
i++;
}
if (j > 0 && p[j - 1] == ' ')
j--;
p[j] = '\0';
}
char *str_trim(char *s) and call as s = str_trim(s); – chux Commented Jan 31 at 3:28
malloc or something, not return the address of a local array whose lifetime is about to end.) – user2357112 Commented Jan 31 at 3:35