If by "seamless" you mean that silence at the beginning and the end of the sound should be trimmed away, that might be harder than it appears.
In general, approach it the way you would trim leading and trailing whitespace from a text string: get the sample data of the recording and scan forward through it until you find an actual sound sample (i.e. anything that is not silence), then save that position. Do the same in reverse, starting at the end of the recording. Use the two saved positions to extract a sub-sound of the recording; that is the trimmed sound you can repeat "seamlessly".
The problem is that a recording is unlikely to contain any true silence; there will be a certain amount of noise (especially when recording with a microphone). You will have to define a reasonable threshold below which any sample is treated as absolute silence. It should not be too high, because that would trim away actual sound data, and it should not be too low either, because then noise would still count as sound and nothing would get trimmed.
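To make the trimming idea concrete, here is a minimal sketch in Python. It assumes the recording is already available as a plain list of numeric amplitudes (for example floats between -1.0 and 1.0); the function name `trim_silence` and the threshold value in the usage note are made up for illustration and are not part of any particular audio API.

```python
def trim_silence(samples, threshold):
    """Return the part of `samples` between the first and the last sample
    whose absolute amplitude is above `threshold`.

    Assumption: `samples` is a list of numeric amplitudes and `threshold`
    is a non-negative number on the same scale.
    """
    # Scan forward past everything that counts as silence.
    start = 0
    while start < len(samples) and abs(samples[start]) <= threshold:
        start += 1

    # Scan backward from the end, but never past `start`.
    end = len(samples)
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1

    # Quiet samples between two loud ones are kept, just as trimming
    # a string keeps the spaces between words.
    return samples[start:end]
```

For example, `trim_silence([0.0, 0.01, 0.5, -0.4, 0.01, 0.3, 0.02], 0.05)` keeps only the span from `0.5` to `0.3`, including the quiet `0.01` in the middle. If every sample is at or below the threshold, the result is an empty list.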
It also cannot be a fixed value, because user A might speak louder than user B, so the correct threshold would differ between them. You would have to do some pre-analysis of the sample as well (this could be done while recording).
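One possible form of that pre-analysis, sketched in Python: track the loudest sample seen while recording and take the threshold as a fraction of that peak. This is only one heuristic among several, not the definitive way to do it. The class name `PeakTracker` and the default ratio of 0.1 are assumptions chosen for illustration, so you would need to tune the ratio against real recordings.

```python
class PeakTracker:
    """Tracks the loudest absolute amplitude seen so far.

    Assumption: audio arrives in chunks (lists of numeric amplitudes)
    while recording, so the peak can be updated incrementally.
    """

    def __init__(self):
        self.peak = 0.0

    def feed(self, chunk):
        # Update the running peak with each newly recorded chunk.
        for sample in chunk:
            self.peak = max(self.peak, abs(sample))

    def threshold(self, ratio=0.1):
        # Hypothetical rule: anything quieter than `ratio` times the
        # loudest sample is treated as silence.
        return self.peak * ratio
```

Because the threshold scales with the peak, a loud speaker and a quiet speaker get different cut-off values from the same code. A single loud click in the recording will inflate the peak, though, so a more robust measure (such as an average level) may work better in practice.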