I think you are missing a random effect in your formula. The response $y_{iab}$ depends on the fixed effects plus an error term with 5 components:
$$\varepsilon_{iab} + \varepsilon_{a\mid b} + \varepsilon_{b} + x\beta_{a\mid b} + x\beta_{b}$$
In order, from left to right, these components have the following interpretations:
- The pure error (personal to each observation)
- Variation due to different levels of A within a common B level
- Variation due to different levels of B
- How A affects the slope of the x relationship given common level B
- How level B affects the slope of x
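To make the five components concrete, here is a rough simulation sketch in plain Python. The fixed intercept `beta0`, the fixed slope `beta1`, all the standard deviations and the group sizes are values I made up for illustration; none of them come from your data, and each component is assumed to be normal with a fixed variance.

```python
import random

def simulate(n_b=4, n_a=3, n_obs=5, sd_obs=1.0, sd_a=0.5, sd_b=0.8,
             sd_slope_a=0.2, sd_slope_b=0.3, beta0=2.0, beta1=1.5, seed=0):
    """Draw data from fixed effects plus the five random components.

    All parameter values are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for b in range(n_b):
        eps_b = rng.gauss(0.0, sd_b)             # variation due to level of B
        slope_b = rng.gauss(0.0, sd_slope_b)     # how level B shifts the slope of x
        for a in range(n_a):
            eps_ab = rng.gauss(0.0, sd_a)            # level of A within a common B
            slope_ab = rng.gauss(0.0, sd_slope_a)    # how A shifts the slope given B
            for i in range(n_obs):
                x = rng.uniform(0.0, 10.0)
                eps_iab = rng.gauss(0.0, sd_obs)     # pure error of this observation
                y = (beta0 + beta1 * x               # fixed effects (assumed form)
                     + eps_iab + eps_ab + eps_b
                     + x * slope_ab + x * slope_b)
                rows.append({"i": i, "a": a, "b": b, "x": x, "y": y})
    return rows

rows = simulate()
print(len(rows), "observations")
```

Note that every standard deviation here is a single number shared across levels, which is exactly the regularity assumption the model relies on.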
You can't allow σ to vary with the level of A, because the model would no longer be identifiable (too many parameters all doing the same job). The exception is when the variation depends on known weights (like group counts); in that case, you would still have the same number of parameters. Remember that we don't know the values of the levels of A (or B), but we estimate them under the assumption of a fixed variance. We need to assume some kind of regularity here.
Edit: @Amoeba questions this and I may have been mistaken about the possibility of different values of the variance of the observations. I misread the OP's question, actually. I was thinking of the variance of the α hidden effects, and not the pure error of the individual observations. Since the A and B levels are random, presumably, the variances should be considered random effects also, which means that some sort of regularization should be applied in estimating them, as is the case with the random effects of the A and B levels themselves.
It gets worse. The value of the mixed effects model is that it allows you to form confidence intervals for untested situations (levels of A and B not included in the model), so you would definitely need to place a distribution on the variances and adjust your confidence intervals accordingly. It sounds pretty ugly.
And for sure, you are going to need a lot of data for this to work well, since we are talking about estimating variances as well as means.
As for the Welch test, that's basically a kludge applied to what used to be called the Behrens-Fisher problem - the problem of testing for the difference of two means when the variances are unequal. If memory serves, the problem is that you don't have a sufficient statistic of fixed dimension on that one.
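For reference, this is roughly what the Welch approach computes: a t statistic with unpooled variances and the Welch-Satterthwaite approximation to the degrees of freedom. It is a sketch using only the Python standard library; the function name `welch_t` is mine, and turning the statistic into a p-value would additionally need a t-distribution CDF, which I have left out.

```python
import math
import statistics

def welch_t(x, y):
    """Return (t, df) for two samples with possibly unequal variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # sample variances
    se2 = v1 / n1 + v2 / n2                                   # unpooled squared standard error
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    # Welch-Satterthwaite approximate degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

t, df = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)
```

The degrees of freedom are an approximation that falls between `min(n1, n2) - 1` and `n1 + n2 - 2`, which is one way of seeing why it is a patch rather than an exact solution.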
To me, the question is why that problem should even admit a meaningful solution. What does it actually mean to compare means when the variances are unequal? Imagine two models of car. Cars from model A typically have a limited and predictable number of repairs each year. Cars from model B are sometimes lemons and sometimes superb. What does it mean to compare the average costs of ownership in this case? But that's what we're talking about when the variances of the levels are allowed to change: it suggests you are comparing apples and oranges.
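A toy version of the car example, with yearly repair costs that I invented purely for illustration: both models have the same mean cost, yet the chance of a very expensive year is completely different, so the equal means say little about which car you would rather own.

```python
import statistics

# Hypothetical yearly repair costs, invented purely for illustration.
model_a = [900, 1000, 1100, 950, 1050, 1000]   # limited and predictable
model_b = [0, 0, 100, 200, 2700, 3000]         # sometimes superb, sometimes a lemon

def summarize(costs, threshold=2000):
    """Mean, standard deviation, and share of years costing more than threshold."""
    return {
        "mean": statistics.mean(costs),
        "sd": statistics.stdev(costs),
        "share_over_threshold": sum(c > threshold for c in costs) / len(costs),
    }

summary_a = summarize(model_a)
summary_b = summarize(model_b)
print(summary_a)
print(summary_b)
```

The `threshold` of 2000 is an arbitrary cutoff for "a bad year"; any cutoff well above the common mean makes the same point.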
Reference. Since you seem to be using R for this, you might want to read Pinheiro and Bates's book Mixed-Effects Models in S and S-PLUS, since they wrote R's nlme package (Bates is also a lead author of lme4). That book goes into all the details you could possibly need. They do allow for correlations among the observations with a common level.